It has been roughly two weeks since Anthropic pushed Claude Mythos Preview out the door under the Project Glasswing banner, and the news cycle has finally moved past the initial “holy cow, it found a 27‑year‑old OpenBSD bug” phase. What is showing up now is more useful to me as a practitioner: an independent capability eval, a concrete defender playbook, and a first real debate about whether Anthropic’s chosen governance model is the right one. I want to walk through what I am taking away.
The AISI Scorecard
The most important document this week did not come from Anthropic. It came from the UK AI Security Institute (AISI), which ran Mythos Preview through capture‑the‑flag ladders and a 32‑step corporate network simulation called “The Last Ones” (TLO) that AISI estimates takes a human red teamer about twenty hours to complete. Mythos Preview cleared expert‑level CTFs at a 73% success rate, a tier that no model before April 2025 had completed at all, and it finished TLO end‑to‑end in three of ten attempts, averaging 22 of 32 attack steps versus Opus 4.6’s 16.
What grabbed me is the bit AISI added at the end: performance “continues scaling beyond the tested 100M token budget.” The ceiling has not been reached. AISI was also careful to note that the test environments lacked active defenders, detection tooling, and alert penalties, which is exactly the kind of friction a real SOC produces. I read that as a useful caveat in both directions. Against a well‑defended target, the absolute ceiling on offense is lower than the lab numbers suggest, but the gap between lab and production is also where most enterprises live, and most of them are not well‑defended.
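One back-of-the-envelope check I find useful before quoting the "three of ten" figure to anyone: ten runs is a small sample. The sketch below is my own arithmetic, not anything AISI published. It computes a standard 95% Wilson score interval for the end-to-end completion rate, plus the step-completion fractions for the two models. The function name and the choice of interval are my assumptions.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Figures quoted above: 3 full completions in 10 attempts, 32 steps total.
    low, high = wilson_interval(3, 10)
    print(f"end-to-end completion: 30% (95% interval {low:.0%} to {high:.0%})")
    print(f"Mythos Preview average step completion: {22 / 32:.1%}")
    print(f"Opus 4.6 average step completion: {16 / 32:.1%}")
```

By my arithmetic the interval runs from roughly 11% to 60%, which is wide enough that I would describe the result as "completes the range sometimes" rather than pin a precise rate on it.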
The Defender Playbook Takes Shape
Barracuda put out the first vendor write‑up I have seen that actually translates the Mythos capability gains into operational guidance, and it is refreshingly boring. Their threat model is that Mythos does not invent new attacks; it collapses the time between disclosure and active exploitation, and it scales vulnerability discovery in ways that were previously human‑bounded. Their prescription is what you would already tell a client in 2026: up the frequency of vulnerability scanning, automate patching, squeeze the exposed attack surface with Zero Trust Network Access and WAFs, tighten segmentation, deploy phishing‑resistant MFA, and — this one matters — actually rehearse your incident playbooks and verify your backups restore.
None of that is novel. What is novel is the argument that boring hygiene is now the load‑bearing control. Anthropic’s own disclosure post echoes the point, recommending defenders “use generally‑available frontier models” like Opus 4.6 today for vuln discovery in their own code, rather than waiting for Mythos access. If your patch SLA is measured in weeks and your asset inventory is a spreadsheet, Mythos is not your problem; your program is. Mythos just makes the consequences of that program arrive sooner.
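To make the patch SLA point concrete, here is a minimal sketch of the check I would run against an asset inventory export. Everything in it is an assumption for illustration: the Finding record shape, the sample hosts and dates, and the seven-day window standing in for a compressed disclosure-to-exploitation gap. None of it comes from Barracuda or Anthropic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Finding:
    """One known vulnerability on one asset (hypothetical record shape)."""
    asset: str
    disclosed: date
    patched: Optional[date] = None  # None means the fix has not shipped yet


def exposure_days(finding: Finding, today: date) -> int:
    """Days the asset was, or still is, exposed after public disclosure."""
    end = finding.patched or today
    return (end - finding.disclosed).days


def over_window(findings: list[Finding], today: date, window_days: int) -> list[Finding]:
    """Findings whose exposure exceeds the assumed exploitation window."""
    return [f for f in findings if exposure_days(f, today) > window_days]


if __name__ == "__main__":
    today = date(2026, 4, 20)  # illustrative "as of" date
    findings = [  # made-up sample data
        Finding("vpn-gateway", date(2026, 4, 1), date(2026, 4, 3)),
        Finding("build-server", date(2026, 3, 20), date(2026, 4, 15)),
        Finding("legacy-crm", date(2026, 4, 2)),
    ]
    for f in over_window(findings, today, window_days=7):
        status = "patched" if f.patched else "STILL OPEN"
        print(f"{f.asset}: {exposure_days(f, today)} days exposed ({status})")
```

The useful knob is window_days. Shrink it to whatever you believe the disclosure-to-exploitation gap is now, and see how much of your estate falls outside it.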
Glasswing’s Asymmetry Problem
The loudest critique this week came from CounterPunch, and while the tone is sharper than I would write myself, the underlying question is a fair one. Anthropic’s rollout gives a small set of “critical industry partners and open source developers” early access to a model that can, per Anthropic’s own numbers, produce working remote code execution exploits overnight against major browsers and operating systems. Over 99% of the vulnerabilities the model has found are unpatched. The responsible disclosure pipeline — cryptographic SHA‑3 commitments, human triagers, measured release to maintainers — is thoughtful, but it is also unambiguously a private process run by a private company.
The piece flags Binoy Kampmark’s framing of “manufacturing the danger and the cure,” and quotes engineer Bulatova Alsu on the idea that “the more we restrict a capable agent, the less predictable its behaviour becomes.” I think the first critique lands harder than the second. The pattern where a frontier lab produces a dual‑use capability and then becomes the gatekeeper of who gets to use it defensively is a governance posture we have not actually debated in public. It happens to coincide with Anthropic’s commercial interests. That does not make it wrong, but it makes it a choice, and choices deserve scrutiny.
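For readers who have not run into hash commitments before, here is a minimal sketch of the general commit-and-reveal idea behind the SHA‑3 commitments mentioned above. It assumes SHA3‑256 over a random salt plus the report bytes. I have not seen Anthropic’s actual pipeline, so the function names, the salt length, and the byte layout are all my assumptions, not a description of their process.

```python
import hashlib
import hmac
import secrets

SALT_BYTES = 32  # assumed fixed length, so salt + report is unambiguous


def commit(report: bytes) -> tuple[str, bytes]:
    """Return (commitment_hex, salt). Publish the commitment; keep salt and report private."""
    salt = secrets.token_bytes(SALT_BYTES)
    digest = hashlib.sha3_256(salt + report).hexdigest()
    return digest, salt


def verify(commitment_hex: str, salt: bytes, report: bytes) -> bool:
    """After the reveal, anyone can check the report matches the earlier commitment."""
    expected = hashlib.sha3_256(salt + report).hexdigest()
    return hmac.compare_digest(expected, commitment_hex)


if __name__ == "__main__":
    report = b"hypothetical finding: heap overflow in example parser"
    commitment, salt = commit(report)
    print("published now:", commitment)
    # Later, once a patch has shipped, the salt and report are revealed.
    print("matches original report:", verify(commitment, salt, report))
    print("matches altered report:", verify(commitment, salt, report + b" (edited)"))
```

The random salt matters: without it, a short or guessable report could be brute-forced from the published digest. What a scheme like this proves is that the finding existed, unaltered, at commitment time. It says nothing about who decides when the reveal happens, which is the governance question.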
What the March Leak Actually Told Us
It is worth remembering how we got here. Mythos did not arrive via a scheduled launch post — Fortune broke the story on March 26 after a CMS misconfiguration exposed roughly 3,000 unpublished assets, including the draft announcement and the internal framing. The leaked draft said Mythos is “currently far ahead of any other AI model in cyber capabilities” and “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”
That language was written before Anthropic had to defend it in public. Read in that light, Glasswing is not just a rollout strategy; it is the operational answer to a threat model Anthropic already believed internally. The framing I keep coming back to is that Anthropic is racing itself. The same leak that revealed Mythos existed also revealed Anthropic’s read that the next wave is worse, and that the defender side has to be seeded before the general availability clock runs out.
What I’m Watching
Three things on my list for the next fortnight. First, whether AISI or another independent lab runs Mythos against an instrumented environment with active defenders; that is the number I need to brief clients on. Second, which industries outside the obvious cloud and browser vendors end up inside Glasswing; the composition of the partner list is the real signal on whose software Anthropic thinks matters. Third, whether any of the “over 99% unpatched” findings start landing as coordinated disclosures in May. That is when we will learn whether the responsible‑disclosure pipeline scales, or whether it bottlenecks on human triagers and becomes the story.
In the meantime, the only recommendation I give clients this week is the one Anthropic itself gave: stop waiting. Opus 4.6 will find plenty of bugs in your code today.
Sources
- Claude Mythos Preview — red.anthropic.com
- Our evaluation of Claude Mythos Preview’s cyber capabilities — UK AISI
- Anthropic’s Claude Mythos: What organizations should do now to boost cyber resilience — Barracuda
- Putting the Calamity Makers in Charge: Anthropic and Claude Mythos Preview — CounterPunch
- Anthropic ‘Mythos’ AI model representing ‘step change’ — Fortune
