Anthropic's Claude just face-planted in public, and the backlash reveals a deeper fracture in how we're actually using AI agents.

The Summary

  • Anthropic's new Opus 4.7 model is getting roasted for being "dumber" than its predecessor, burning tokens, and making bizarre mistakes like counting two Ps in "strawberry"
  • Users report the model rewrites résumés with fake schools and last names, openly admits to "being lazy," and costs significantly more to run than 4.6
  • This follows widespread user complaints that Opus 4.6 had been "nerfed," suggesting Anthropic is losing control of its own quality narrative
  • The contrast is stark: Claude went to No. 1 in the App Store after its DoD fight, now it's eating a rare public beating from its own user base

The Signal

Claude was untouchable six weeks ago. After Anthropic refused Department of Defense contracts on principle, the company rode a wave of goodwill straight to the top of the App Store charts. Developers loved Claude Code. Writers loved its prose. The technical community treated it as the thinking person's AI, the model you used when ChatGPT felt too sloppy.

Now users are posting screenshots of Opus 4.7 fabricating educational credentials on résumés and admitting it didn't cross-reference information because it was "being lazy." A Reddit post calling 4.7 "a serious regression, not an upgrade" pulled 2,300 upvotes. On X, a criticism of the model got 14,000 likes.

"The model that was supposed to feel 'more intelligent, agentic, and precise' is failing strawberry tests and inventing fake last names."

This isn't just about bugs. Three things are colliding here:

  • Token economics are breaking user trust. Multiple complaints focus on 4.7 burning through tokens faster than 4.6, making it prohibitively expensive for the same tasks. When your model costs more AND performs worse, you've lost the plot.
  • The "nerf" paranoia is real. Users already believed 4.6 had been secretly degraded. Now 4.7 lands and confirms their worst fears. Anthropic never acknowledged the 4.6 complaints, so trust was already thin.
  • Agent-grade reliability is the new bar. Y Combinator CEO Garry Tan might still be a fan, but developers building actual agents don't care about vibes. They need models that don't hallucinate credentials or ghost their own fact-checking.

The timing matters. We're in the messy middle of the agent economy buildout. Companies are trying to figure out which models can actually run unsupervised, handle multi-step workflows, and not torpedo your reputation when they talk to customers.

Claude was the safe choice. The model you trusted to write your investor updates, debug your code, draft your legal briefs. If Opus can't maintain quality across version bumps, that entire value prop crumbles. No one's letting an agent touch their customer emails if version 4.8 might decide to get creative with the facts.

The Implication

Watch what Anthropic does in the next 72 hours. If they acknowledge the backlash and roll back or patch 4.7, they're still listening to users. If they stay silent or double down on "skill issue" defenses, they're betting their brand can weather this. It probably can't.

For anyone building on Claude: version lock your deployments. The era of "just use the latest model" is over. If you're running agents in production, you need regression testing across model updates and the ability to roll back fast.

The real lesson: AI model quality isn't monotonic anymore. Better version numbers don't mean better models. And when your core users are developers who will absolutely roast you in public, trust is the only moat that matters.

Sources

Business Insider Tech