Anthropic's Most Dangerous AI Leaked to Random Discord Users for Weeks

The AI safety crowd keeps warning us about rogue superintelligence, but it turns out the real threat vector is a Discord server and some bored kids with time on their hands.

The Summary

Unauthorized users accessed Anthropic's Claude Mythos — a model the company itself labeled too dangerous for public release due to cyberattack capabilities — on April 7th, the same day Anthropic announced limited testing
The group operates through a private Discord channel dedicated to finding unreleased AI models, and they've had regular access for weeks
Other unreleased Anthropic models have also been compromised by the same group, pointing to a systemic security problem
The gap between Anthropic's public warnings about AI danger and their actual operational security is now a matter of public record

The Signal

Here's the timeline that should make every AI safety executive nervous. On April 7th, Anthropic announced Mythos would go to a limited number of companies for controlled testing. Same day, a handful of Discord users got their hands on it. Not sophisticated nation-state actors. Not a coordinated hack. Just people in a chat room who apparently hunt for unreleased models as a hobby.

The unauthorized group has been using Mythos regularly since then, though Bloomberg's source says not for cybersecurity purposes. That's cold comfort. The whole reason Anthropic flagged Mythos as dangerous is its potential to enable cyberattacks. Whether the Discord crew is using it to write poetry or exploit zero-days is beside the point. The capability is loose.

"So on the one hand, Anthropic itself is the one describing Mythos as a dangerous national security threat. On the other hand, their own security is so sloppy that rando hooligans on Discord have had access to Mythos."

This isn't a one-off breach. Bloomberg reports the same group has accessed other unreleased Anthropic models. Pattern, not incident. Which raises questions about how Anthropic gates access to anything. If your threat model assumes careful distribution to vetted partners, but your actual distribution includes anyone who knows where to look, your threat model is fiction.

The irony here cuts deep. AI labs have spent years lobbying for regulatory frameworks built around the idea that they're responsible stewards of dangerous technology. Voluntary commitments. Self-governance. Trust us, we know what we're doing. Then a few Discord users demonstrate that the security posture is more "honor system" than "Fort Knox."

Key contradictions:

Anthropic publicly positions Mythos as a national security concern
Unauthorized access happened the same day as the official limited release announcement
The breach persisted for weeks without detection or remediation

The Implication

If you're building on or investing in frontier AI, this should recalibrate your risk assessment. The danger isn't just what these models can do in theory. It's what happens when companies can't control who gets them in practice. Every AI lab talks about responsible deployment. Start asking them to show receipts on their access controls, audit trails, and incident response.

For regulators watching this space, here's your case study. Self-regulation works until it doesn't. When "too dangerous for public release" and "available to randos on Discord" describe the same model on the same day, the argument for external oversight writes itself.

Sources

Daring Fireball | Mashable Tech

The Summary

The Signal

The Implication

Sources

Keep Reading