University economists just showed that AI tutors can work, but only if you build them to withhold answers.

The Signal

Two economists at the University of Wisconsin-La Crosse ran a real experiment with 140 undergrads in spring 2025. They built "Macro Buddy" using ChatGPT's custom GPT feature, trained it to guide students through reasoning instead of spitting out answers, and tested it against traditional study methods. The setup was clean: four sections of the same macroeconomics course, identical materials and exams, study formats randomly assigned after the first test. One group worked alone without AI. Another worked together without AI. A third got Macro Buddy plus peer discussion.

The students who combined Macro Buddy with peer work scored higher on exams than students working alone. Here's what matters: exams were in-person, no notes, no AI allowed during testing. Scores reflected what stuck in actual human brains, not what students could prompt out of a chatbot in real time.

This cuts against the panic narrative. Ninety percent of college students surveyed in 2025 already use generative AI for coursework. The question isn't whether they'll use it. The question is whether we can design AI tools that actually teach instead of just completing assignments. These researchers built their tutor with web access disabled, trained it specifically to ask questions back, and paired it with human collaboration. Those design choices, not the AI itself, drove the learning gains.
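For builders, the core of that design fits in a system prompt. Below is a minimal sketch of a "withhold answers" tutor using the OpenAI chat API. The study itself used ChatGPT's no-code custom GPT builder, so the model name, prompt wording, and the `socratic_tutor` helper here are illustrative assumptions, not the researchers' actual configuration.

```python
# Sketch of the tutor's instruction design, assuming the OpenAI Python SDK.
# Macro Buddy was built with ChatGPT's custom GPT builder; this API version
# is an assumed equivalent, not the study's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The key design choice: instruct the model to withhold answers and respond
# with guiding questions. Web browsing is simply not enabled here, mirroring
# the study's disabled web access.
SYSTEM_PROMPT = (
    "You are a macroeconomics study tutor. Never state the final answer. "
    "Guide the student with one probing question at a time, check their "
    "reasoning step by step, and only confirm an answer after the student "
    "has explained the logic in their own words."
)

def socratic_tutor(student_message: str, history: list[dict] | None = None) -> str:
    """Return a guiding question rather than a direct answer."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += history or []
    messages.append({"role": "user", "content": student_message})
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model; the study used a custom GPT
        messages=messages,
    )
    return response.choices[0].message.content

print(socratic_tutor("What happens to real GDP if the Fed raises rates?"))
```

The constraint lives entirely in the instructions and the disabled tooling, which is why the same base model can either complete the assignment or teach the student, depending on how it's wrapped.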

The Implication

The architecture of AI tools determines whether they make humans smarter or lazier. If you're building education tech or training tools for the agent economy, the lesson is clear: constrain the AI to scaffold thinking, not replace it. The winning formula here wasn't AI alone. It was AI designed to withhold answers, plus humans working together. Watch for more experimentation in this direction, especially in professional training, where reasoning matters more than memorized answers.


Source: Fast Company Tech