The First Karpathy Loop for Production Coding Agents

Source: DEV Community
Karpathy showed what happens when you let an AI agent run 700 experiments overnight. The model proposes hypotheses, runs them, scores results, keeps what works, throws away what doesn't. Repeat. The part nobody talks about: how do you know which experiments actually mattered?

I've been building with AI coding agents for months: Claude Code, Codex, Gemini CLI. The pattern is always the same. You give an agent a task, it runs, it produces output. Sometimes the output is good. Sometimes it's not. You squint at logs, compare diffs, make a judgment call. Move on. That loop works fine for single tasks. It breaks completely when you want the agent to iterate on its own work.

The Problem

Say you want an agent to optimize a function. Or fix a flaky test. Or refactor a module until it passes a quality gate. Without loops, you're doing this manually: run the agent, check the output, run it again with different instructions, check again, copy-paste the good parts. This is not what "autonomous" means.
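The propose-run-score-keep cycle described above can be sketched in a few lines. This is a minimal illustration, not anything from Karpathy's actual setup: the `propose` and `score` functions here are hypothetical stand-ins (a random perturbation and a toy fitness function) for whatever the agent would really do.

```python
import random

def propose(best_params):
    """Hypothetical mutation step: perturb the current best parameters."""
    return {k: v + random.uniform(-0.1, 0.1) for k, v in best_params.items()}

def score(params):
    """Hypothetical fitness function: higher is better (closer to zero)."""
    return -sum(v * v for v in params.values())

def experiment_loop(n_experiments=700):
    best = {"x": 1.0, "y": -1.0}
    best_score = score(best)
    for _ in range(n_experiments):
        candidate = propose(best)   # propose a hypothesis
        s = score(candidate)        # run it and score the result
        if s > best_score:          # keep what works
            best, best_score = candidate, s
        # everything else is thrown away
    return best, best_score
```

The structure is the whole point: a scoring function turns "squint at logs and make a judgment call" into a mechanical comparison, which is what lets the loop run 700 times overnight without a human in it.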