DeepMind Veteran Raises $1.1B to Kill Human Training Data

The biggest AI funding round of 2026 just bet against the entire premise of ChatGPT.

The Summary

Ineffable Intelligence raised $1.1 billion to build AI systems that learn through reinforcement learning instead of human-generated and human-labeled training data.
The company was founded by a Google DeepMind veteran who believes reinforcement learning, not large language models, is the path to superintelligence.
This is a direct challenge to the LLM orthodoxy that has defined AI development since GPT-3.

The Signal

The entire AI industry spent the last four years scraping the internet, hiring armies of contractors to label data, and building bigger transformer models. Ineffable Intelligence just raised over a billion dollars to do the opposite.

The bet is simple: human-generated data is a bottleneck. LLMs learn from text written by humans, from records of what humans click, and from human feedback on model outputs. They are prediction machines trained on the corpus of human output. Reinforcement learning, in its pure form, skips that step entirely. The agent learns by trial and error in an environment, receiving rewards for good outcomes and penalties for bad ones, with no human in the loop labeling "good" and "bad" examples.
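To make "trial and error in an environment" concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning method. It is an illustration only and says nothing about what Ineffable is actually building. The corridor environment, the reward values, and the function names `train_corridor_agent` and `greedy_policy` are all assumptions invented for this example.

```python
import random


def train_corridor_agent(n_states=6, episodes=2000, alpha=0.5,
                         gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in state 0. Reaching the last state pays +1 and ends
    the episode; every other step costs 0.01. Action 0 moves left (staying
    put at the left wall), action 1 moves right. No labeled examples are
    used: the only learning signal is the reward.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action] is the agent's running estimate of long-term reward.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                # Trial: occasionally try a random action.
                action = rng.randrange(2)
            else:
                # Otherwise take the best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else -0.01

            # Error: nudge the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state (0 = left, 1 = right)."""
    return [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

Under these assumed settings the agent should end up preferring to move right in every state, having been told nothing except the reward it received. The hard part, as the article goes on to argue, is that real-world tasks rarely come with an environment and reward this clean.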

"Reinforcement learning is the path to superintelligence, not large language models."

This is not a new idea. DeepMind used reinforcement learning to build AlphaGo, which beat Lee Sedol, one of the world's strongest Go players, in 2016 after playing millions of games against itself. Human game records were needed only for the initial bootstrap. AlphaZero then taught itself chess, shogi, and Go to superhuman levels within about a day of training, starting from nothing but the rules.

But games have clear rules and win conditions. The real world does not. The open question is whether you can build general intelligence with RL alone, or whether you need the common sense and world knowledge that LLMs absorb from human text. Ineffable is betting you can. With $1.1 billion, it has the runway to find out.

Key implications:

If RL works, the data moat disappears. No more scraping lawsuits, no more paying contractors $15/hour to label images.
If RL works, agents get better by doing, not by reading about doing. They improve through deployed experience, not corpus updates.
The companies winning at LLMs today have massive data advantages. This funding round is a bet that advantage is temporary.

The Implication

Watch where this capital flows. If Ineffable hires robotics engineers and builds simulation infrastructure, they are serious about RL in physical and digital environments. If they hire linguists and data labelers six months from now, the thesis bent to reality.

The timing matters. LLM scaling appears to be hitting a wall. GPT-5 is not the leap over GPT-4 that GPT-4 was over GPT-3. Compute costs are rising faster than capabilities. The industry is hunting for the next S-curve. Ineffable just raised enough money to build it, or die trying.

Sources

RWA Times | Decrypt
