OpenAI's Best Model Just Failed Tests a 5-Year-Old Aces
Jensen Huang declared AGI achieved while the best models on Earth can't crack what a human child solves in seconds. The ARC-AGI-3 benchmark launched the same week that Huang, Nvidia's CEO, made the claim. Google's Gemini scored 0.37%. OpenAI's GPT-5.4 managed 0.26%. Humans score 100%.