Anthropic is warning the White House that its unreleased AI model makes large-scale hacks inevitable this year, and the attackers won't be human.
The Summary
- Anthropic's "Mythos" model is so advanced at autonomous hacking that the company is privately briefing government officials before release, calling it "far ahead of any other AI model in cyber capabilities"
- The first documented AI-executed cyberattack has already happened: Chinese state actors used AI agents to autonomously handle 80-90% of operations across 30 targets
- The risk compounds as companies deploy AI agents internally, inadvertently creating new attack surfaces that hostile AI agents can exploit
- One source briefed on upcoming models predicts a major attack in 2026
The Signal
The agent economy just hit its first existential paradox. The same capabilities that make AI agents valuable for automation make them devastating for offense. Anthropic's internal assessment describes Mythos as capable of exploiting vulnerabilities "in ways that far outpace the efforts of defenders." That asymmetry is the whole ballgame.
Here's the mechanism: modern AI agents can reason, improvise, and persist without human oversight. They explore attack surfaces methodically, learn from failures instantly, and scale far beyond what any human operation could. The Chinese state-sponsored attack late last year proved the concept, with AI handling nearly all tactical execution autonomously. Mythos represents a generational leap beyond that baseline.
The timing creates a brutal irony. Companies are racing to deploy AI agents for customer service, data analysis, and internal workflows, and every agent becomes a potential vector. Employees spin up experimental agents without security review. Agents get access to internal systems, codebases, and network architecture. Each deployment teaches the broader AI ecosystem a little more about how corporate infrastructure works.
Anthropic is rare in its transparency here, briefing officials before release. But Mythos won't be alone for long. OpenAI and others are building comparable models. The capability frontier moves fast. Within months, multiple models with advanced autonomous hacking abilities will be in the wild, some more carefully controlled than others.
The defender's dilemma gets worse: you can't defend against persistent, intelligent automation with human-speed responses. You need AI defending against AI. But that creates an arms race where offense currently has the edge. Attackers need one vulnerability. Defenders need perfect coverage.
The Implication
If you're deploying AI agents internally, put every deployment through a security review. Lock down credentials, segment network access, and log everything. The agents you're building to save time could be teaching hostile agents how to get in. Watch for unusual agent behavior, especially persistence patterns or unexpected system queries. And assume that by Q4 2026, automated attacks will be probing your infrastructure constantly. The warehouse of infinite sophisticated criminals isn't coming. It's being built right now.
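The "log everything, watch for persistence patterns and unexpected system queries" advice above can be sketched in code. This is a minimal illustration, not a real product: the allowlist, threshold, and `AgentMonitor` class are all hypothetical names invented here, standing in for whatever audit pipeline your organization actually runs.

```python
import logging
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical allowlist: the internal systems this agent is expected to touch.
ALLOWED_TARGETS = {"crm-db", "ticket-queue"}

# Illustrative threshold: flag an agent that retries the same failed action
# this many times -- a crude proxy for machine-speed persistence.
PERSISTENCE_THRESHOLD = 5

@dataclass
class AgentMonitor:
    """Logs every agent action and collects alerts on suspicious patterns."""
    failures: Counter = field(default_factory=Counter)
    alerts: list = field(default_factory=list)

    def record(self, agent_id: str, target: str, action: str, ok: bool) -> None:
        # Log everything: who did what, where, and whether it succeeded.
        logging.info("agent=%s target=%s action=%s ok=%s",
                     agent_id, target, action, ok)
        # Unexpected system query: the agent touched something off-allowlist.
        if target not in ALLOWED_TARGETS:
            self.alerts.append(f"{agent_id}: unexpected target {target}")
        # Persistence pattern: repeated failures on the same target/action.
        if not ok:
            key = (agent_id, target, action)
            self.failures[key] += 1
            if self.failures[key] >= PERSISTENCE_THRESHOLD:
                self.alerts.append(
                    f"{agent_id}: persistent retries on {target}/{action}")
```

A real deployment would feed these alerts into an existing SIEM rather than a Python list, but the two checks shown (scope violations and retry bursts) are the cheapest first signals that an agent is probing rather than working.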
Source: Axios