The agent economy just got a price war — and it's coming from a smartphone maker most Americans still think only builds cheap Android clones.
The Summary
- Xiaomi released MiMo-V2.5 and V2.5-Pro, both MIT-licensed open source models optimized for agentic "claw" tasks — the work of AI agents that actually go do things on your behalf across messaging apps and productivity tools.
- The Pro version achieves a 63.8% success rate on claw benchmarks at ~70K tokens per task, which is 40-60% fewer tokens than Claude Opus 4.6, Gemini 3.1 Pro, and OpenAI equivalents.
- This matters because usage-based billing is replacing rate limits and flat subscriptions — efficiency is now a competitive advantage, not just a technical virtue.
The Signal
Xiaomi isn't playing the model size game. They're playing the efficiency game. And that's the more interesting bet for anyone building production agent systems in 2026.
The MiMo models are purpose-built for what the industry calls "claw tasks" — agents that integrate with platforms like OpenClaw, NanoClaw, and Hermes Agent to automate work humans used to do manually. Marketing content generation, email triage, scheduling, account management. The boring, expensive, repetitive stuff that eats 60% of knowledge worker time.
"The Pro model leads the open-source field with a 63.8% success rate, consuming only ~70K tokens per trajectory."
What makes this release notable isn't the accuracy: 63.8% is good but not groundbreaking. It's the token efficiency. In relative terms, MiMo-V2.5-Pro uses 40-60% fewer tokens than frontier closed models from Anthropic, Google, and OpenAI to achieve comparable task completion. That gap compounds quickly when you're running thousands of agent tasks per day.
Here's why that matters right now:
- GitHub Copilot just moved to usage-based billing
- Microsoft is charging per-token for most agent-adjacent services
- Anthropic and OpenAI are testing metered pricing in enterprise deals
- Every major cloud provider is shifting AI workloads from flat subscription to consumption models
The all-you-can-eat buffet is closing. If your agent architecture was designed for "free" inference during the ChatGPT era, your unit economics are about to get ugly. Token efficiency is the new moat.
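To make the unit economics concrete, here is a rough back-of-the-envelope sketch. The ~70K tokens per task and the 40-60% savings range come from the figures above. Everything else is an assumption for illustration: the task volume, the per-token price, the idea that both models bill at the same rate, and the function names themselves.

```python
def daily_token_cost(tasks_per_day: int, tokens_per_task: float,
                     usd_per_million_tokens: float) -> float:
    """Daily spend under simple per-token billing."""
    return tasks_per_day * tokens_per_task * usd_per_million_tokens / 1_000_000


def implied_baseline_tokens(efficient_tokens: float, savings_fraction: float) -> float:
    """If the efficient model uses (1 - s) of the baseline's tokens,
    the baseline uses efficient / (1 - s)."""
    if not 0 <= savings_fraction < 1:
        raise ValueError("savings_fraction must be in [0, 1)")
    return efficient_tokens / (1 - savings_fraction)


# Hypothetical workload and price: placeholders, not vendor pricing.
TASKS_PER_DAY = 5_000
USD_PER_MILLION_TOKENS = 1.0

# ~70K tokens per task is the figure reported for MiMo-V2.5-Pro.
MIMO_TOKENS_PER_TASK = 70_000

mimo_cost = daily_token_cost(TASKS_PER_DAY, MIMO_TOKENS_PER_TASK, USD_PER_MILLION_TOKENS)

for savings in (0.40, 0.60):  # the claimed savings range
    baseline_tokens = implied_baseline_tokens(MIMO_TOKENS_PER_TASK, savings)
    baseline_cost = daily_token_cost(TASKS_PER_DAY, baseline_tokens, USD_PER_MILLION_TOKENS)
    print(f"savings={savings:.0%}: baseline ${baseline_cost:,.2f}/day "
          f"vs efficient ${mimo_cost:,.2f}/day")
```

At these placeholder numbers the efficient model costs $350 a day, against roughly $583 to $875 for a baseline that needs 40-60% more of your budget per task. Swap in your own volume and real prices; the ratio is what carries over.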
Xiaomi is also making a licensing play that matters. MIT License means enterprises can modify, fork, and deploy these models in commercial production without royalties, usage restrictions, or the legal ambiguity that surrounds some "open" model licenses. You can run MiMo locally, on your VPC, or fine-tune it for proprietary workflows. No phone calls to legal.
The company's benchmark, ClawEval, positions both models in the top-left quadrant: high task success, low token consumption. That's the sweet spot for production deployment. It's also a direct shot at the closed model vendors who've been treating agentic workloads as a premium SKU.
The Implication
If you're building agent infrastructure, you now have a credible open-source alternative that won't bankrupt you under usage-based pricing. Test MiMo-V2.5-Pro against your current stack. If the token savings you measure on your own workloads land within 10% of the claimed figures and accuracy is acceptable, the cost delta might justify the switch, especially for high-volume, low-stakes tasks.
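One way to frame that test is cost per successful task, since a cheap attempt that fails still has to be retried or handled by a human. The sketch below is an assumption about how you might score it, not a method Xiaomi or ClawEval prescribes. The `ModelRun` class, the `should_switch` rule, the thresholds, the price, and the "current-stack" numbers are all hypothetical; only the 63.8% success rate and ~70K tokens come from the reported results.

```python
from dataclasses import dataclass


@dataclass
class ModelRun:
    name: str
    success_rate: float            # fraction of tasks completed on YOUR workload
    tokens_per_task: float         # mean tokens per trajectory
    usd_per_million_tokens: float  # your real price; placeholder values below

    def cost_per_success(self) -> float:
        """Expected spend to get one completed task."""
        cost_per_attempt = self.tokens_per_task * self.usd_per_million_tokens / 1_000_000
        return cost_per_attempt / self.success_rate


def should_switch(current: ModelRun, candidate: ModelRun,
                  min_success_rate: float, min_savings: float = 0.10) -> bool:
    """Switch only if the candidate clears an accuracy floor AND saves
    at least `min_savings` on cost per successful task."""
    if candidate.success_rate < min_success_rate:
        return False
    savings = 1 - candidate.cost_per_success() / current.cost_per_success()
    return savings >= min_savings


# Reported figures for the candidate; price is a placeholder.
candidate = ModelRun("MiMo-V2.5-Pro", success_rate=0.638,
                     tokens_per_task=70_000, usd_per_million_tokens=1.0)
# Entirely hypothetical stand-in for whatever you run today.
current = ModelRun("current-stack", success_rate=0.66,
                   tokens_per_task=140_000, usd_per_million_tokens=1.0)

print(should_switch(current, candidate, min_success_rate=0.60))
print(should_switch(current, candidate, min_success_rate=0.65))
```

The accuracy floor is checked first on purpose: for low-stakes, high-volume tasks you can set it loosely, while anything customer-facing probably deserves a stricter bar regardless of the savings.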
For the hyperscalers, this is a warning shot. Chinese hardware companies are coming for the agent layer with models that are good enough, cheap enough, and open enough to win enterprise workloads where "best" doesn't matter as much as "cost-effective and controllable."
The agent economy scales on margins, not benchmarks. Xiaomi just made the margins better.