Auto-Research While You Sleep Just Got Open-Sourced

Someone just open-sourced the operating system for research agents that work while you're offline.

The Summary

ARIS (Auto-Research-In-Sleep) is a markdown-based framework that lets AI agents conduct autonomous ML research — finding papers, generating ideas, running experiments, and cross-checking their own work with reviewer loops
No framework lock-in: works with Claude, GPT-4, or any LLM agent you point at it, including local models via LM Studio or Ollama
Ships with 62+ reusable "skills" (markdown templates), a persistent research wiki that maintains paper/idea/experiment relationships, and self-evolution capabilities where the system analyzes its own logs and proposes improvements

The Signal

This isn't another wrapper around OpenAI's API. ARIS is a coordination layer for autonomous research workflows, built entirely on markdown files that any LLM can interpret. The architecture is surprisingly elegant: instead of hard-coding agent behaviors in Python classes, the project defines 62+ research "skills" as structured markdown templates. An agent reads the template, executes the task, writes results back to markdown. No vendor lock-in. No framework tax.
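The read, execute, write-back cycle described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ARIS's actual code: the skill layout (`# Skill:`, `## Inputs`, `## Steps`), the helper names `render_prompt` and `run_skill`, and the `echo_agent` stand-in are all hypothetical, and the real templates may be structured differently.

```python
from pathlib import Path
from typing import Callable

# Hypothetical skill template; the real ARIS skill format may differ.
SKILL_MD = """\
# Skill: literature-scan

## Inputs
- topic

## Steps
1. List three recent papers on {topic}.
2. Summarize each in one sentence.
"""


def render_prompt(skill_text: str, **inputs: str) -> str:
    """Fill the template's {placeholders} with concrete inputs."""
    return skill_text.format(**inputs)


def run_skill(skill_path: Path, out_path: Path,
              agent: Callable[[str], str], **inputs: str) -> str:
    """Read a markdown skill, hand it to any agent callable, write markdown back."""
    prompt = render_prompt(skill_path.read_text(encoding="utf-8"), **inputs)
    result = agent(prompt)
    out_path.write_text(f"# Result: {skill_path.stem}\n\n{result}\n",
                        encoding="utf-8")
    return result


def echo_agent(prompt: str) -> str:
    """Stand-in for a real LLM call; reports how many numbered steps it saw."""
    steps = [ln for ln in prompt.splitlines() if ln[:1].isdigit()]
    return f"Executed {len(steps)} steps."
```

Because `agent` is just a function from prompt text to result text, swapping providers means swapping one callable; the skill file and the result file stay plain markdown that a human can open and audit.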

The core loop is what makes this interesting for the agent economy. ARIS implements cross-model review cycles — one model generates research ideas or experimental designs, a second model (potentially from a different provider) reviews the work for logical gaps or methodological problems. This is closer to how actual research teams operate than the single-agent-does-everything pattern most people default to. The system maintains a "Research Wiki" — a persistent knowledge graph of papers, claims, experiments, and their relationships — that survives across sessions.
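A cross-model review cycle of this kind might look roughly like the sketch below. The names `review_loop` and `ReviewResult`, the prompt wording, and the convention that the reviewer answers "APPROVE" when satisfied are all assumptions made for illustration; the source does not describe how ARIS signals approval.

```python
from dataclasses import dataclass
from typing import Callable

Model = Callable[[str], str]  # any function mapping a prompt to a completion


@dataclass
class ReviewResult:
    draft: str
    rounds: int
    approved: bool


def review_loop(task: str, author: Model, reviewer: Model,
                max_rounds: int = 3) -> ReviewResult:
    """Alternate between an author model and a reviewer model until approval."""
    draft = author(task)
    for round_no in range(1, max_rounds + 1):
        critique = reviewer(f"Review this draft for gaps:\n{draft}")
        # Assumed convention: the reviewer replies "APPROVE" when satisfied.
        if critique.strip().upper().startswith("APPROVE"):
            return ReviewResult(draft, round_no, True)
        # Otherwise feed the critique back to the author for a revision.
        draft = author(f"{task}\n\nRevise to address:\n{critique}")
    return ReviewResult(draft, max_rounds, False)
```

The `author` and `reviewer` callables can wrap clients for different providers, which is the point of the pattern: the reviewer does not share the author's blind spots. The `max_rounds` cap keeps an unattended run from looping forever when the two models never agree.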

"Auto-compaction corruption fix. Compaction summary preserved on OpenAI-compat executors."

The changelog tells the real story. Version 0.4.4 dropped three weeks ago with fixes for third-party Anthropic-compatible proxies, provider-aware routing, and state management across model switches. Version 0.3.5 added self-evolution: the agent analyzes its own execution logs and proposes patches to its skill definitions. This is meta-learning at the infrastructure level. The system gets better at research by doing research on itself.
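As a rough picture of what "analyzing its own execution logs" could mean, the sketch below counts failures per skill and flags the ones that fail too often. The JSON-lines log format, the `skill` and `status` field names, the thresholds, and the `propose_patches` function are hypothetical; the source does not specify how ARIS logs runs or generates patches.

```python
import json
from collections import Counter
from typing import Iterable


def propose_patches(log_lines: Iterable[str], min_runs: int = 3,
                    max_failure_rate: float = 0.5) -> list[str]:
    """Flag skills whose failure rate exceeds a threshold (assumed log schema)."""
    runs: Counter = Counter()
    failures: Counter = Counter()
    for line in log_lines:
        event = json.loads(line)  # one JSON object per log line (assumption)
        runs[event["skill"]] += 1
        if event["status"] != "ok":
            failures[event["skill"]] += 1

    proposals = []
    for skill in sorted(runs):
        rate = failures[skill] / runs[skill]
        # Ignore skills with too few runs to judge.
        if runs[skill] >= min_runs and rate > max_failure_rate:
            proposals.append(
                f"{skill}: {failures[skill]}/{runs[skill]} runs failed; "
                "review its steps"
            )
    return proposals
```

In a real system the flagged skills would presumably be handed back to a model along with the failing transcripts so it can draft a revised template; this sketch only covers the triage step.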

What's genuinely novel here:

Markdown as the coordination protocol means human researchers can read, edit, and audit the entire workflow
Support for local models (LM Studio, Ollama) means you can run this without API costs or rate limits
The "plan mode" and cooperative interrupt handling suggest someone actually used this for multi-hour research runs
Cross-provider reviewer routing solves the "how do I use Claude and GPT-4 in the same pipeline" problem everyone hits
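The cross-provider reviewer routing in the list above comes down to one rule: the reviewer should not come from the author's provider. A minimal sketch, assuming a hypothetical `Endpoint` record and `pick_reviewer` function (the provider and model strings are placeholders, not ARIS configuration values):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    provider: str   # e.g. "anthropic", "openai", "ollama" (placeholder labels)
    model: str      # placeholder model name
    local: bool = False


def pick_reviewer(author: Endpoint, pool: list[Endpoint],
                  prefer_local: bool = False) -> Endpoint:
    """Choose a reviewer from a different provider than the author."""
    candidates = [e for e in pool if e.provider != author.provider]
    if not candidates:
        raise ValueError("no endpoint from a different provider is available")
    # Endpoints matching the locality preference sort first; sorted() is
    # stable, so ties keep their original pool order.
    return sorted(candidates, key=lambda e: e.local != prefer_local)[0]
```

Setting `prefer_local=True` steers review traffic to a locally hosted model when one is in the pool, which is one way to keep long unattended runs clear of API costs and rate limits.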

The project supports Windows experimentally, works with Cursor and other LLM-native editors, and maintains agent-specific documentation (AGENT_GUIDE.md) formatted for machine consumption. That last detail matters: it suggests the maintainers are designing for a world where other agents discover and learn to use ARIS autonomously.

The Implication

If you're building agent workflows for research, analysis, or any task that needs review loops and persistent memory, study this architecture. The markdown-as-protocol approach means you're not betting on any single LLM provider or agent framework. You're building on a coordination layer that will outlast whatever model is hot this quarter.

For teams already running Claude Code or similar tools: these skills are drop-in templates. You don't need to adopt the full CLI. Fork the repo, grab the markdown files that match your workflow, and adapt them. The real value isn't the code; it's the research coordination patterns encoded in those 62+ skills.

The self-evolution piece is the long-term signal. Agents that can analyze their own performance logs and propose infrastructure improvements are agents that compound in capability over time. That's different from just getting better prompts. That's actual learning at the system level.

Sources

GitHub Trending Python
