How Durable is Our Visibility into AI Cyberattacks?

1 minute read

Published: June 29, 2026

Our ability to monitor AI-enabled cyberattacks may be far more fragile than it looks. Today we get useful signal about how attackers use AI, but much of that visibility could erode as adversaries grow more sophisticated and move off detectable channels.

In this post I map out the six main sources of threat intelligence we currently rely on—controlled capability evaluations, real-world vulnerability discovery, open-source pen-testing tools, monitoring of the underground ecosystem, incident forensics, and LLM API usage data—and assess how durable each one is. The worrying pattern is that several of these sources are at risk of diminishing as attackers stop using API-hosted LLMs in favour of self-hosted models and improve their operational security.

A further problem is that much of the critical data sits with private-sector actors—security firms, model providers, and network defenders—who share findings based on commercial incentives rather than public benefit. To avoid losing visibility into cutting-edge AI cyberattack capabilities exactly when it matters most, I argue governments should fund independent threat-intelligence collection and sharing, develop AI-specific forensic capabilities, build the capacity to evaluate models on real-world tasks, and establish an “Agentic Cybersecurity Exchange” that aggregates signals across infrastructure operators.

Read the full post on Attack Surface


Will we get automated alignment research before an AI Takeoff?

1 minute read

AI may automate large parts of AI R&D within the next decade, dramatically accelerating progress. A crucial question for existential risk is the ordering: will automation speed up capabilities research or safety research first? If capabilities race ahead while safety lags, we could find ourselves with very powerful systems and no commensurate ability to make them safe.

Safety Cases Explained: How to Argue an AI is Safe

16 minute read

Safety Cases are a promising approach in AI Governance inspired by other safety-critical industries. They are structured arguments, based on evidence, that a system is safe in a specific context. I will introduce what Safety Cases are, how they can be used, and what work is currently being done on them. This explainer leans on Buhl et al. (2024). At the end, I survey expert opinions on the promise and weaknesses of Safety Cases.

A Call for Better Risk Modelling

8 minute read

TL;DR: The EU’s Code of Practice (CoP) requires AI companies to conduct state-of-the-art Risk Modelling. However, the current SoTA has severe flaws. By creating risk models and improving methodology, we can enhance the quality of risk management performed by AI companies. This is a neglected area, hence we encourage more people in AI Safety to work on it. Work on Risk Modelling is urgent because the CoP is to be enforced starting in 9 months (Aug 2, 2026).

Jan Wehner
