Anthropic built an AI so good at hacking that they won't release it, so Goldman Sachs is testing it instead.

The Summary

The Signal

This is the first time a frontier AI lab has publicly refused to release a model because it's too effective at offense. Anthropic isn't hiding Mythos in a vault. They're running a controlled experiment with 11 handpicked organizations, including Goldman Sachs, to stress-test enterprise defenses before attackers get their hands on the same capabilities. The implicit message: the threat is real, imminent, and sophisticated enough that even amateurs could punch through corporate infrastructure.

Goldman's CEO calling out cybersecurity as a "top priority" during an earnings call is not standard CEO boilerplate. Solomon specifically noted the bank is "hyperaware" of Mythos's capabilities and coordinating across Anthropic and its own security stack. That level of executive attention means the internal threat assessments are alarming. When a bank that processes trillions announces it's scrambling to defend against an AI that doesn't exist in the wild yet, you're watching the cyber arms race accelerate in real time.

"Cybersecurity has long been at the core of our business, and we have for a very, very long time, put enormous resources into it."

The timing matters. Last week, Solomon and other banking leaders met with top officials about the risk. That's not a routine vendor briefing. That's regulators, banks, and AI builders trying to coordinate a response to a capability shift they didn't see coming this fast. The fact that HackerOne's CEO is simultaneously saying "we're less safe now" suggests the defensive playbook hasn't caught up to what Mythos can already do.

Project Glasswing is a elegant branding for "we need guinea pigs with deep pockets and strong defenses to tell us what breaks first." The 11 organizations testing Mythos aren't getting early access as a perk. They're getting it because they have the resources to identify vulnerabilities, patch systems, and share intelligence before the model leaks or gets replicated. Anthropic is betting that controlled exposure beats a surprise zero-day from an adversary who built something similar in secret.

Key unknowns:

  • What specific vulnerabilities can Mythos exploit that earlier models couldn't?
  • How long before open-source alternatives replicate these capabilities?
  • Are the other 10 Glasswing participants also financial institutions, or did Anthropic spread access across critical infrastructure sectors?

The Implication

If you're running security for anything that touches the internet, this is your signal to stop thinking about AI as a productivity tool and start thinking about it as the next generation of attackers. The fact that Anthropic felt confident enough to build Mythos but scared enough to withhold it means the capability curve just went vertical. Project Glasswing buys maybe six months before these techniques show up in attacker toolkits, either through leaks, independent discovery, or open-source replication.

For companies building AI agents that touch sensitive systems, this is a design constraint you can't ignore. Every agent you deploy is a potential entry point. Every API call is a surface area. The old security model assumed humans were the weak link. The new model assumes your automated systems are talking to other automated systems, and some of them are adversarial by design.

Sources

Bloomberg Tech | Business Insider Tech