Your AI coding assistant isn't broken; you're just using it backwards.

The Signal

The Katana Quant piece that hit 435 points on HN makes a deceptively simple argument: LLMs generate better code when you write the tests first. Not groundbreaking on its face, but the implications cut deeper than TDD evangelism. The author's data shows that when developers define acceptance criteria before prompting, correctness rates jump from roughly 60% to 85%. More importantly, iteration cycles shrink. You stop playing whack-a-mole with edge cases because you defined what success looks like before the agent started generating.
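The loop the piece describes can be sketched as a small harness: the acceptance tests are fixed before any code exists, and each candidate the agent returns is scored against them. Everything here is illustrative, not from the article; `ask_agent` is a hypothetical stand-in for whatever LLM API you actually call.

```python
# Sketch of the test-first loop: success is defined before generation starts.
# `ask_agent` is a hypothetical placeholder for a real LLM call.

def ask_agent(spec: str) -> str:
    # Stand-in: a real version would send `spec` to an LLM and return its code.
    return "def add(a, b):\n    return a + b"

# Acceptance criteria written first: (expression, expected result).
ACCEPTANCE = [
    ("add(2, 3)", 5),    # happy path
    ("add(-1, 1)", 0),   # edge case: crosses zero
]

def passes_spec(source: str) -> bool:
    ns = {}
    exec(source, ns)  # load the candidate code into a scratch namespace
    return all(eval(expr, ns) == want for expr, want in ACCEPTANCE)

candidate = ask_agent("Implement add(a, b). It must satisfy: " + repr(ACCEPTANCE))
print(passes_spec(candidate))  # prints True
```

The point of the harness is the ordering, not the machinery: `ACCEPTANCE` exists before `candidate` does, so "not quite right" becomes a failing test rather than a feeling.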

This matters because most people treat AI coding tools like magic boxes. Describe what you want, hope for the best, debug when it breaks. That worked okay when you were the only one writing code. It falls apart when you're managing a stable of AI agents doing the actual implementation. The bottleneck isn't generation speed anymore; it's specification quality. If you can't articulate what correct looks like, you can't evaluate what the agent produces. You end up in an endless loop of "not quite right" because you're debugging toward a moving target you never properly defined.

The real insight here isn't about LLMs at all. It's about human cognition. Writing tests first forces you to think clearly about requirements before you get distracted by implementation details. When the agent is doing the implementation, that clarity becomes the entire job. The agents aren't getting smarter at understanding vague requirements. You're getting better at not being vague.

The Implication

Start treating prompt engineering like spec writing. Before you ask an AI to build something, write down how you'll know if it worked. Not in your head; on paper. Test cases, edge conditions, failure modes. The discipline of defining acceptance criteria first is the skill that separates people who use AI tools from people who actually ship with them.
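What "test cases, edge conditions, failure modes" looks like in practice: a short file you write before prompting, covering each category. The function here (`slugify`) and its implementation are hypothetical; the implementation stands in for whatever the agent eventually produces, and the tests are the part you write first.

```python
import re

# Stand-in for agent-generated code; in the workflow above, this arrives LAST.
def slugify(title: str) -> str:
    s = title.strip().lower()
    s = re.sub(r"[^a-z0-9]+", "-", s)  # collapse runs of non-alphanumerics
    return s.strip("-")

# Written FIRST, before any prompt is sent.

# Test case: the obvious happy path.
assert slugify("Hello World") == "hello-world"

# Edge condition: punctuation runs collapse to a single separator.
assert slugify("a -- b!!") == "a-b"

# Failure mode: degenerate input yields an empty slug, not a crash.
assert slugify("   ") == ""

print("spec satisfied")
```

If the agent's first attempt fails the third assertion, you know exactly what to say in the next prompt, instead of describing the bug from memory.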


Source: Hacker News Best