The government is pushing Wall Street to test an AI that finds zero-day exploits while the Pentagon calls the same company a supply-chain risk.

The Summary

The Signal

This is a mess that reveals how unprepared institutions are for AI agents that can actually break things. Wall Street banks began internal testing of Mythos after Trump officials encouraged them to use it for vulnerability detection. At the same time, the Pentagon flagged Anthropic as a security risk. No one seems to have coordinated these positions before banks started plugging an AI into their infrastructure.

The technical claims are already under fire. Jaya Baloo, COO and CISO of cybersecurity firm Aisle, told Bloomberg Tech that their testing shows cheap open-source models can identify the same bug Anthropic has been showcasing. If true, this deflates the "hacker's superweapon" framing. It suggests Mythos might be good AI engineering packaged with extraordinary marketing, not a paradigm shift in offensive security capabilities.

"Cheap open-source models can find the same bug Anthropic has highlighted."

But even if Mythos is overhyped, the premise matters. Wired calls its arrival a wake-up call for developers who have treated security as an afterthought. That part is real. AI agents designed to probe for vulnerabilities will force a reckoning, whether or not Anthropic has the best one. The question is whether organizations will respond by hardening code or by racing to acquire their own offensive AI before competitors do.

The government's contradictory stance exposes deeper confusion:

  • Treasury wants Mythos access to find flaws in federal systems
  • The Pentagon warns against Anthropic as a supply-chain threat
  • Trump officials simultaneously encourage banks to test the model

This isn't strategy. It's three agencies discovering they have no shared framework for evaluating AI risk when the AI can actually do something dangerous. The incentive structure is backwards. If you're a bank, do you test Mythos because Treasury might mandate it later? Do you avoid it because Defense says it's risky? Do you assume everyone else is testing it quietly, so you'd better get access before they find your bugs first?

The answer for most institutions will be to test quietly and say nothing publicly. That means a small number of organizations will gain months of lead time learning what Mythos-class models can find, while everyone else operates blind. Treasury's move to seek access suggests the government knows this and wants in before the window closes.

The Implication

Watch which banks go quiet about their security posture over the next six months. If Mythos or similar models actually work, institutions that tested early will be patching in silence. The ones that didn't will be vulnerable and won't know it until someone else's agent finds the hole.

For AI builders, the lesson is simpler. Security researchers have been warning about AI-assisted exploit discovery for years. If Anthropic delivered it and the market response is "cool, but open-source does this too," then the moat isn't the model. It's access, speed, and the willingness to ship something governments simultaneously fear and demand. That's a different business than building the best AI.

Sources

Bloomberg Tech | TechCrunch AI | Wired AI