Anthropic Project Glasswing: when the safety lab builds the threat
In April 2026, Anthropic released what it described as its most dangerous model to date, identifying thousands of vulnerabilities across major software products including OpenBSD, FFmpeg, and the Linux kernel. It launched Project Glasswing the same day, a private 40-company coalition built to defend against the capability Anthropic had just produced. Not governments. Not regulators. The same actor produced the threat and selected the response.
On 7 April 2026, Anthropic released Claude Mythos Preview. Casey Newton at Platformer reported the company described it internally as its most dangerous model to date. Alongside the release, Anthropic launched Project Glasswing: a coalition of more than forty technology companies given early access to Mythos to find and patch critical vulnerabilities before public disclosure. Anthropic allocated $100 million in usage credits to coalition members and $4 million in additional funding to open-source security work.
Within its first weeks of access, Mythos identified thousands of high-severity vulnerabilities. Newton reported a 27-year-old flaw in OpenBSD, an FFmpeg vulnerability that had survived 5 million automated tests, and multiple Linux kernel vulnerabilities among them. The defensive coalition includes Apple, Google, Microsoft, Cisco, and Broadcom.
The justification Anthropic offered is the same one that defines the project: the only way to defend against a dangerous AI capability is to build it first. The justification is also the case to be made against the project. A private company now holds nation-state-grade offensive cybersecurity capability and decides — under its own governance — which forty companies see the output and which do not.
Verified. Anthropic released Claude Mythos Preview on 7 April 2026 and launched Project Glasswing on the same day. Both are confirmed by Anthropic's own announcements and reported in detail by Casey Newton at Platformer.
Verified. Project Glasswing is structured as an invitation-only coalition of 40+ technology companies. Anthropic allocated $100 million in usage credits to coalition members and $4 million to open-source security funding. Coalition membership includes Apple, Google, Microsoft, Cisco, and Broadcom.
Verified. Within initial deployment, Mythos identified thousands of high-severity vulnerabilities across major operating systems and browsers. Newton documented specific examples: a 27-year-old flaw in OpenBSD, an FFmpeg vulnerability that had survived five million previous automated tests, and multiple Linux kernel vulnerabilities. Anthropic described the model internally as its most dangerous to date.
Probable. Anthropic briefed CISA and the Center for AI Standards and Innovation before launch. Newton reported it is not clear the government has taken Anthropic up on the offer to evaluate Mythos. The same government is in an active legal dispute with Anthropic over Pentagon contract terms documented in BC-001.
Probable. Kelsey Piper, quoted in the Platformer piece, observed that a private company now holds incredibly powerful zero-day exploits of almost every software project of significance, and that the incentives to steal Anthropic's model weights have risen accordingly. The first half is verifiable from the project description. The second half is structural inference from a credible analyst, reported but not independently corroborated.
Unverified. Reporting suggests the criteria for coalition admission and the criteria for vulnerability disclosure are governed by internal Anthropic policy. The policy is not public. The decision-making process is not externally auditable.
The defensive justification for Glasswing is real. Adversarial actors with access to Mythos-class capability would be a serious threat. Building the defensive capability inside a coordinated private response is, in narrow terms, faster than waiting for public defensive tooling to catch up. That argument is the project's case for itself.
The structural problem is that every element of the defence is private. The model is built by Anthropic. The coalition is selected by Anthropic. The credit allocation is decided by Anthropic. The vulnerability findings are disclosed on a schedule set by Anthropic. The criteria distinguishing "responsible defensive work" from "private offensive capability accumulation" are written by Anthropic. There is no external body with statutory authority over any of those decisions.
This is the Broken Control Loop in its cleanest documented form. The safety justification for building a dangerous capability is structurally identical to the danger itself. The same actor produces the threat, identifies it, decides who gets to know about it, and decides what counts as adequate response. Every checkpoint in the loop is owned by the same entity.
Anthropic's defenders argue the alternative is worse. A state actor or organised criminal group developing Mythos-class capability in secret, without a defensive coalition, would produce concentrated catastrophic risk. Public regulators do not move at the speed of frontier AI. Coordinated private action is the only available response, and Anthropic has the most developed Responsible Scaling Policy in the industry to do it credibly. They warned about the danger before releasing the tool, which places them ahead of competitors releasing similar capability without disclosure.
The argument is correct on facts and incomplete on structure. Private coordination is a stopgap, not a governance model. Glasswing's existence is evidence that regulation has failed, not evidence that private coordination is sufficient. Anthropic's own published position has consistently called for stronger AI regulation. The same Anthropic remains in unresolved dispute with the US government over Pentagon contract terms, a reminder that the actor coordinating private cybersecurity defence is in active conflict with the body that would notionally oversee it.
Capability centralisation. A single private company now holds and brokers exploit knowledge across most of the global software stack. The forty coalition members benefit. Software vendors outside the coalition do not. End users have no view into either side of that line.
Theft incentive. The value of Mythos's model weights to a state-level adversary has risen in proportion to the capability demonstrated. Anthropic carries the security burden for assets whose successful exfiltration would represent a national-security event. Public oversight of that security posture does not exist.
Regulatory displacement. Glasswing fills a governance gap with a private substitute. Filling the gap reduces the political pressure to legislate. The longer Glasswing operates, the less urgent statutory frameworks appear, and the more entrenched the private alternative becomes.
Government relationship break. The same government negotiating Pentagon contracts with Anthropic (and attempting a supply-chain risk designation against the company, per BC-001) is reportedly not engaging with Anthropic's offer to evaluate Mythos. The most consequential current AI safety work is being done in active legal conflict with the body that would notionally oversee it.
An independent body — statutory or treaty-based — with authority to audit coalition membership criteria, vulnerability disclosure timelines, and model weight security would convert a private arrangement into a governed one. Glasswing's defenders would lose nothing if such a body existed. The case for the project is stronger, not weaker, with external scrutiny attached.
Transparency about the coalition admission process and the criteria distinguishing defensive use from offensive capability accumulation would close the largest single information gap. Anthropic's own published Responsible Scaling Policy already commits the company to external evaluation of frontier capabilities. Extending that commitment to Glasswing would be consistent with stated policy.
Mandatory model-weight security audit by a body independent of Anthropic and of any coalition member would address the theft-incentive risk without exposing the underlying intelligence. The audit standard exists in adjacent industries. Importing it is a governance task, not a technical one.
Project Glasswing is a defensible response to a real threat. But it is a response built by a company against a threat it produced. The case for the project is sound. The case for the structure is not.
Primary reporting. Casey Newton, "Why Anthropic's new model has cybersecurity experts rattled" — Platformer, 7 April 2026. platformer.news
Expert commentary. Kelsey Piper, quoted in the Platformer piece on power centralisation and theft incentive structure.
Anthropic primary statements. Project Glasswing launch communications and Claude Mythos Preview model card, referenced and partially quoted in the Newton piece.
Cross-reference. BC-001 — Anthropic vs the Pentagon for the government relationship context. BC-002 — Anthropic DMCA Takedown for the corporate conduct pattern.
QUESTIONS
What is Anthropic Project Glasswing?
Project Glasswing is a coalition of more than 40 technology companies launched by Anthropic on 7 April 2026 alongside the release of its Claude Mythos Preview model. Coalition members were given early access to Mythos to identify and patch critical vulnerabilities. Anthropic allocated $100 million in usage credits to coalition members and $4 million in additional funding to open-source security work. Apple, Google, Microsoft, Cisco, and Broadcom are among the confirmed members.
Why is Anthropic Mythos considered dangerous?
Anthropic described Mythos internally as its most dangerous model to date because of its demonstrated ability to identify high-severity software vulnerabilities at industrial scale. Within initial deployment, the model identified thousands of vulnerabilities across major operating systems and browsers, including a 27-year-old flaw in OpenBSD and an FFmpeg vulnerability that had survived five million previous automated tests. The same capability that defines its defensive value is the capability that defines its offensive risk.
What is the Broken Control Loop and how does it apply to Project Glasswing?
The Broken Control Loop is a BrokenCtrl framework describing what happens when the same entity produces a risk, identifies it, decides who is informed about it, and decides what counts as adequate response — with no external authority over any of those steps. Glasswing is the cleanest documented instance: Anthropic builds the model, selects coalition members, allocates credits, and sets vulnerability disclosure schedules under its own internal governance.
How does Project Glasswing relate to Anthropic's Pentagon dispute?
The same period that saw Glasswing launch was the period in which Anthropic was in active legal conflict with the US government over Pentagon contract terms — the case documented at BrokenCtrl as BC-001. The same government that attempted a supply-chain risk designation against Anthropic was reportedly not engaging with Anthropic's offer to evaluate Mythos. The two cases together describe a governance environment where the most consequential AI safety work is being done in active institutional conflict with the body that would notionally oversee it.
Is private cybersecurity coordination on Mythos-class AI a sustainable governance model?
The argument for private coordination is that no public alternative currently moves fast enough to defend against frontier AI capability. The argument against private coordination is that filling the governance gap with a private substitute reduces the political pressure to legislate, while concentrating governance authority inside the company that built the threat. Coordinated private action is a defensible stopgap. As a permanent governance model it is not sustainable. BrokenCtrl's editorial position is that the existence of Glasswing is evidence the regulatory environment has failed to keep up — not evidence that private coordination is sufficient.