Academia's most important preprint server just drew a line: use AI to think, not to fake it, or lose your publishing privileges for a year.

The Signal

ArXiv isn't fighting AI. It's fighting lazy researchers who treat language models like autopilot and submit papers they never actually read. The new policy, announced by computer science section chair Thomas Dietterich, targets submissions with hallucinated citations (references to papers that don't exist) or leftover chatbot text like "As an AI language model, I cannot..." still embedded in the draft.

This matters because ArXiv is infrastructure. It hosts over 2.4 million preprints and serves as the primary distribution channel for bleeding-edge research in physics, math, CS, and increasingly, AI itself. When researchers cite "arXiv:2401.12345" in their work, they're pointing to this repository. If it fills with unvetted LLM output, the citation graph of science becomes polluted.

"The issue isn't using LLMs, it's publishing without reading what they wrote."

The year-long ban is blunt but calibrated. Waiting out the year isn't enough: authors who violate the policy will need to prove they can clear peer review at a legitimate venue before ArXiv will host their work again. That's a reputational hit in fields where ArXiv preprints often circulate for months before journal publication, shaping research directions in real time.

ArXiv's move follows growing concern about "careless use of large language models" in academic writing. The tells are obvious when you know what to look for:

  • Citations to papers that were never written
  • Suspiciously smooth prose that suddenly includes phrases like "I cannot verify" mid-paragraph
  • Reference lists with perfectly formatted but entirely fictional DOIs
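
The second tell is the easiest to check mechanically before submitting. Here is a minimal sketch in Python, assuming a hypothetical phrase list; the phrases are illustrative and are not ArXiv's actual criteria, and a clean result says nothing about whether the citations are real.

```python
# Hypothetical phrase list for illustration only; not ArXiv's criteria.
LEFTOVER_PHRASES = [
    "as an ai language model",
    "i cannot verify",
    "i don't have access to",
]


def find_leftover_phrases(text):
    """Return (line_number, phrase) pairs for each telltale phrase found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()  # match regardless of capitalization
        for phrase in LEFTOVER_PHRASES:
            if phrase in lowered:
                hits.append((line_number, phrase))
    return hits


# Made-up draft text used only to exercise the function.
draft = """Our method improves accuracy on all benchmarks.
As an AI language model, I cannot confirm these results.
Prior work reports similar gains, though I cannot verify the source."""

for line_number, phrase in find_leftover_phrases(draft):
    print(f"line {line_number}: found '{phrase}'")
```

A scan like this only catches text nobody read. It is no substitute for reading the draft.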

What's interesting is what this policy doesn't ban. Using an LLM to draft sections, polish grammar, or generate code examples remains fine. The line is verification. If you use AI to write and then publish without checking whether it hallucinated your bibliography, you're out.
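
One way to make that verification step concrete is to pull every DOI-looking string out of a reference list and turn it into a link to open by hand. The sketch below rests on assumptions: the pattern is a loose approximation of DOI syntax, the function name is hypothetical, and the sample DOIs are placeholders. It does not confirm that a DOI exists or that the paper says what the draft claims; that part still requires a human.

```python
import re

# Loose approximation of DOI syntax: "10.", a numeric prefix, "/", a suffix.
# An assumption for illustration; real DOIs can be messier than this.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def doi_check_urls(reference_text):
    """Map each DOI-looking string to its doi.org URL, in order of appearance."""
    urls = {}
    for match in DOI_PATTERN.finditer(reference_text):
        # Drop sentence punctuation that trails the DOI in running text.
        doi = match.group(0).rstrip(".,;)")
        urls.setdefault(doi, "https://doi.org/" + doi)
    return urls


# Placeholder references; these DOIs are invented for the example.
references = """[1] A. Author. Some title. doi:10.1234/example.2024.001.
[2] B. Author. Another title. https://doi.org/10.5555/demo-42"""

for doi, url in doi_check_urls(references).items():
    print(doi, "->", url)
```

The output is a checklist, not a verdict. Each link still has to be opened and compared against the sentence that cites it.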

The Implication

This is a template for how institutions will handle AI in knowledge work. The question isn't "did you use AI?" but "did you take responsibility for the output?" ArXiv's enforcement mechanism, requiring peer review re-entry, creates a filter that keeps the repository credible without rejecting AI tools outright.

For researchers, the message is clear: AI is a drafting partner, not a ghostwriter. Platforms managing user-generated content at scale should watch this model. Year-long bans plus a credibility checkpoint for return could work anywhere quality matters more than volume. The slop wars are only beginning, and ArXiv has shown how to fight them without banning the tools.

Sources

TechCrunch AI | The Verge AI