Glasswing Has a Crack: What the Mythos Leak Tells Us About Controlled AI Releases

Three days ago, Gizmodo reported that an unidentified group is using Claude Mythos without Anthropic’s permission. Anthropic has confirmed the access. That confirmation, short and factual, is the most consequential thing said about frontier AI this month, and it lands at the worst possible moment for the controlled-release thesis the industry has been quietly converging on. I’ve been sitting with this since the headline broke, and I think the security community is underreacting.

Here’s where my head is at.

What Mythos actually does that Opus 4.6 didn’t

The capabilities gap between Mythos Preview and Opus 4.6 isn’t incremental — it’s a regime change, and I want to anchor on the numbers before getting to the governance angle.

Anthropic’s own red-team writeup is the cleanest source: on their internal vulnerability-discovery benchmark, Opus 4.6 generated zero crashes at tier 3 across roughly 175 attempts. Mythos Preview produced 595 crashes at tiers 1 and 2, added crashes at tiers 3 and 4, and chained an exploit that an unmodified Opus 4.6 reportedly couldn’t develop in hundreds of tries. The UK AI Security Institute independently saw the same pattern: 73% success on expert-level capture-the-flag tasks (no model could complete any of these before April 2025), and on AISI’s 32-step “The Last Ones” enterprise-network attack range, Mythos Preview is the first model to solve it end-to-end — three out of ten attempts — averaging 22 of 32 steps. Opus 4.6, the next best, averaged 16.

That last figure is the one I keep coming back to. Multi-host, multi-stage attack simulation that human professionals estimate at 20 hours of work, completed start-to-finish by a language model. We are no longer arguing about whether AI can meaningfully assist offensive security. The argument now is about distribution.

Project Glasswing was supposed to be the answer

Anthropic’s response to those numbers was to not release Mythos to the public. Instead, they stood up Project Glasswing, a controlled-access program that, as best I can piece together from public reporting, gives Mythos to a hand-picked group of critical-infrastructure operators and large platforms (Apple, Google, Microsoft, Cisco, and Amazon are all named) plus a handful of open-source defenders. The pitch is straightforward: give the people defending the most important systems a head start over the eventual day when models of this caliber are widely available.

I find the logic basically sound. If you accept that capabilities of this kind will inevitably proliferate — and I do — then a 6-to-18-month defender lead is genuinely valuable. It’s the AI-safety equivalent of disclosing a critical CVE to vendors before going public.

The problem is that Project Glasswing’s threat model assumed the only people with access were people Anthropic gave access to.

The leak changes the math

We don’t yet know the shape of the unauthorized access. Is it credential theft from a Glasswing partner? An insider at Anthropic? A weights exfiltration? An API-key compromise? Each of those implies wildly different remediation, and Anthropic has been understandably tight-lipped while they investigate. But the existence of the access — confirmed, not just alleged — does two things:

First, it collapses the “defender head-start” argument from a temporal advantage into a race condition. If an unknown actor has had Mythos for some unknown number of weeks, the defenders who got it on April 8 may already be behind, not ahead. We don’t know what’s been done with it.

Second, and this is the part I think is being undersold: it sets a precedent. The next time a frontier lab argues that a model is too dangerous to release publicly but acceptable to share with twenty named partners, the counter-argument is now empirical, not hypothetical. “Controlled access leaks” stops being a thought experiment.

What I think MSPs and IT teams should actually do this week

I run a small managed-services practice, so I’ll keep this concrete. None of us are getting Mythos Preview through Glasswing. That doesn’t matter — the implications still land on our desks.

Patch hygiene gets re-prioritized. AISI is explicit that Mythos succeeds against systems with weak posture and gets stuck against well-defended ones. The line between those two states is mostly your patch SLA, your egress controls, and whether you have any meaningful EDR coverage on the endpoints attackers actually land on.
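For the patch-SLA piece, here is a minimal sketch of the check I have in mind: given a list of open findings, flag the ones that have been patchable for longer than the SLA allows. Everything here is an assumption for illustration: the `Finding` fields, the SLA windows in `SLA_DAYS`, the host names, and the placeholder CVE IDs. Swap in whatever your scanner or RMM actually exports.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical SLA windows in days, keyed by severity. Tune to your own policy.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90}


@dataclass
class Finding:
    host: str
    cve: str          # placeholder IDs below, not real CVEs
    severity: str
    published: date   # date the patch became available


def overdue(findings, today):
    """Return findings whose patch age exceeds the SLA for their severity, oldest first."""
    late = [
        f for f in findings
        # Unknown severities fall back to the loosest window (an assumption).
        if (today - f.published).days > SLA_DAYS.get(f.severity, 90)
    ]
    return sorted(late, key=lambda f: f.published)


findings = [
    Finding("edge-fw-01", "CVE-0000-0001", "critical", date(2025, 1, 2)),
    Finding("file-srv-02", "CVE-0000-0002", "high", date(2025, 1, 20)),
    Finding("kiosk-03", "CVE-0000-0003", "medium", date(2025, 1, 5)),
]
today = date(2025, 2, 1)

for f in overdue(findings, today):
    print(f.host, f.cve, f.severity)
```

With the sample data, only the critical finding on `edge-fw-01` is past its window. The useful part is less the code than agreeing on the numbers in `SLA_DAYS` and actually running the check on a schedule.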

Detection engineering for agentic behavior moves up the list. The kill-chain fingerprint of a language-model attacker (long pauses for reasoning, retries on noisy commands, oddly verbose error handling) is observable, and it’s not what your SIEM rules were tuned for. I’m going to spend some of next sprint on this.
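As a starting point, here is a rough sketch of how two of those signals could be counted from a per-session command log. It is not a tested detection: the `Event` shape, the 20-second `pause_s` threshold, and the definition of a "retry" (a failed command followed by another run of the same binary) are all my assumptions, and long pauses on their own describe plenty of human operators too. Treat the output as a feature to feed into a rule, not a verdict.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float         # seconds since session start
    command: str
    exit_code: int


def agentic_signals(events, pause_s=20.0):
    """Count two crude signals: long gaps between commands, and retries of a failed command."""
    long_pauses = 0
    retries = 0
    for prev, cur in zip(events, events[1:]):
        # Signal 1: a long gap before the next command (assumed threshold).
        if cur.t - prev.t >= pause_s:
            long_pauses += 1
        # Signal 2: the previous command failed and the same binary is run again.
        same_binary = (
            prev.command.strip() != ""
            and prev.command.split()[:1] == cur.command.split()[:1]
        )
        if prev.exit_code != 0 and same_binary:
            retries += 1
    return {"long_pauses": long_pauses, "retries": retries}


# Made-up session for illustration only.
session = [
    Event(0.0, "nmap -sV 10.0.0.5", 0),
    Event(45.0, "curl http://10.0.0.5/admin", 7),
    Event(80.0, "curl -v http://10.0.0.5/admin", 7),
    Event(83.0, "curl -v -k https://10.0.0.5/admin", 0),
]

print(agentic_signals(session))
```

On the sample session this reports two long pauses and two retries. Whether those counts separate model-driven sessions from human ones in a real environment is exactly what needs baselining before any of this becomes an alert.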

Secrets rotation gets a second look. If the Mythos leak turns out to be credential-driven, a lot of vendors are about to discover that “service account password unchanged since 2021” is a finding now.
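A quick way to find those accounts before an auditor does is to sort your credential inventory by age. The sketch below assumes you can export a mapping of secret name to last-rotation date from your vault or directory; the names, dates, and the 365-day `MAX_AGE_DAYS` policy are invented for illustration.

```python
from datetime import date

MAX_AGE_DAYS = 365  # assumed rotation policy; set to whatever yours actually says


def stale_secrets(last_rotated, today, max_age_days=MAX_AGE_DAYS):
    """Return (name, age_in_days) for secrets older than the policy, oldest first."""
    aged = [(name, (today - rotated).days) for name, rotated in last_rotated.items()]
    stale = [item for item in aged if item[1] > max_age_days]
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Hypothetical inventory export.
inventory = {
    "svc-backup": date(2021, 3, 1),
    "svc-monitoring": date(2025, 1, 10),
    "api-key-billing": date(2023, 6, 15),
}
today = date(2025, 2, 1)

for name, age in stale_secrets(inventory, today):
    print(f"{name}: {age} days since rotation")
```

Here `svc-backup` and `api-key-billing` come back as stale, in that order, and `svc-monitoring` does not. The hard part in practice is getting a trustworthy last-rotation date for every service account, not the comparison itself.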

What I’m watching

Three things, in order. First, whether Anthropic discloses the vector of the unauthorized access and not just the fact of it; that determines whether Glasswing partners need to assume their own footholds are compromised. Second, whether AISI’s planned hardened-environment evaluations land before another model in this tier ships. Third, whether EU and UK regulators treat the leak as evidence that voluntary controlled-release programs need a statutory backstop, or let the labs self-correct.

This is the part of the story where the policy moves faster than the model improvements, or it doesn’t. Either outcome is informative.

