Anthropic Built an AI So Dangerous Even They Won't Release It

The AI safety company just built something so dangerous it won't let you use it—but the NSA already is.

The Summary

Anthropic released Mythos, an AI model specialized in finding software vulnerabilities, but restricted access to select parties because of its offensive capabilities
The NSA is testing Mythos to find security flaws in Microsoft products and other popular software, making it one of the "carefully chosen parties"
This marks a new category: AI models too powerful for public release, raising questions about who decides what "responsible hands" means

The Signal

Anthropic has built an AI that's exceptional at one specific task: breaking software. Mythos doesn't write code, doesn't chat, doesn't generate images. It hunts for vulnerabilities in computer systems with enough skill that Anthropic concluded a public release would be irresponsible. If weaponized by attackers, the company says, it could accelerate data theft and infrastructure disruption at scale.

The technical capability is one thing. The access decision is another. The NSA is already using Mythos to probe Microsoft software for security holes. That makes the intelligence agency both customer and validator for Anthropic's risk assessment. The NSA gets early access to an offensive security tool. Everyone else gets locked out.

"An AI safety company just gated the most powerful vulnerability scanner behind institutional approval."

This isn't the first restricted AI release, but it's the first where the restriction is purely about capability, not alignment. Previous limited releases focused on preventing harmful content generation or jailbreaks. Mythos works exactly as designed. The danger is in what it does well, not what it does wrong.

Key dynamics at play:

Dual-use tools where "good guys" and "bad guys" use identical capabilities
Private companies deciding which institutions get offensive cyber tools
Security researchers locked out of the most advanced vulnerability detection

The precedent matters more than the model. If Mythos-class tools become the norm, cybersecurity splits into haves and have-nots based on institutional relationships with AI labs. Independent researchers, smaller companies, and open source projects get left behind. The NSA and Microsoft get a partnership that finds flaws before adversaries do.

The Implication

We're entering the era of capability-gated AI. Not every model will be safe to release, even if it works perfectly. Watch for more tools that perform legitimate functions but carry asymmetric risk. Vulnerability research, exploit development, social engineering, influence operations—these are all domains where AI capability could outpace our ability to distribute it safely.

For anyone building security infrastructure: your threat model just changed. Assume adversaries will eventually access Mythos-tier tools through leaks, independent development, or espionage. The gap between institutional and criminal capability is closing faster than your patch cycle.

Sources

Bloomberg Tech

The Summary

The Signal

The Implication

Sources

Keep Reading