Autonomous coder
An autonomous coder is an LLM agent that accepts a high-level task (a ticket, an issue, a feature request) and produces a working pull request (PR) without step-by-step human guidance. Devin, OpenHands, Sweep, SWE-agent, and Claude Code's agent mode are 2026 examples.
The autonomous-coder category is what most "AI software engineer" headlines in 2024-2026 referred to. The agent reads the task, explores the repo, plans steps, edits files, runs tests, iterates until the tests pass, and opens a PR. Production-quality use requires strong [[swe-bench]] performance, durable sandboxed execution ([[agent-sandbox]]), a reliable plan-execute loop, cost caps, and observable traces. In practice, dependency-update and typo-fix tickets work; cross-cutting refactors and ambiguous bug reports fail; tickets that call for design judgment are wasted on the agent. The right pattern in 2026: run autonomously on small, well-scoped, test-covered tickets, and keep a human in the loop ([[approval-workflow]]) for bigger work.
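The propose, test, iterate cycle described above can be sketched as a bounded loop. This is a minimal illustration, not how any named product works: `run_agent`, `RunResult`, and the two stub callables are hypothetical names, and a real agent would call an LLM and a sandboxed test runner where the stubs stand in.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RunResult:
    status: str            # "pr_opened" or "gave_up"
    iterations: int        # number of propose/test rounds used
    trace: List[str] = field(default_factory=list)  # observable step log


def run_agent(
    task: str,
    propose_patch: Callable[[str, str], str],          # (task, test feedback) -> patch
    apply_and_test: Callable[[str], Tuple[bool, str]],  # patch -> (passed, test output)
    max_iterations: int = 5,                            # hypothetical default
) -> RunResult:
    """Iterate until tests pass or the iteration budget runs out."""
    trace = [f"task: {task}"]
    feedback = ""
    for i in range(1, max_iterations + 1):
        patch = propose_patch(task, feedback)
        trace.append(f"iter {i}: proposed {patch}")
        passed, feedback = apply_and_test(patch)
        trace.append(f"iter {i}: tests {'passed' if passed else 'failed'}")
        if passed:
            # A real agent would open the PR here.
            return RunResult("pr_opened", i, trace)
    return RunResult("gave_up", max_iterations, trace)


# Stubs standing in for the model and the sandboxed test runner (assumptions).
def fake_propose(task: str, feedback: str) -> str:
    # The fake "model" only gets it right after seeing test feedback.
    return "patch-v2" if feedback else "patch-v1"


def fake_test(patch: str) -> Tuple[bool, str]:
    ok = patch == "patch-v2"
    return ok, "" if ok else "AssertionError in test_readme"


result = run_agent("Fix typo in README", fake_propose, fake_test)
```

The hard iteration limit is the design choice that matters here: without it, a ticket the agent cannot solve keeps the loop running indefinitely.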
When to use an autonomous coder
- Well-scoped tickets with strong tests.
- Backlog burn-down on small bugs.
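One way to apply these criteria is a small triage step in front of the agent queue. The sketch below is an assumption-laden illustration: the `Ticket` fields, the `route` function, and the `max_files=3` threshold are all hypothetical and would need tuning against a real backlog.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    title: str
    files_touched_estimate: int   # rough size of the change (hypothetical field)
    has_test_coverage: bool       # are the affected paths covered by tests?
    acceptance_criteria: str      # empty string means the ticket is ambiguous


def route(ticket: Ticket, max_files: int = 3) -> str:
    """Return "autonomous" for small, well-scoped, test-covered tickets."""
    small = ticket.files_touched_estimate <= max_files
    well_scoped = bool(ticket.acceptance_criteria.strip())
    if small and well_scoped and ticket.has_test_coverage:
        return "autonomous"
    # Everything else goes through human review or approval.
    return "human_in_the_loop"


typo_fix = Ticket("Fix typo in error message", 1, True, "Message reads 'received'")
refactor = Ticket("Refactor auth across services", 40, True, "All services use new client")
vague = Ticket("App feels slow sometimes", 1, True, "")
```

With these example tickets, only `typo_fix` is routed to the agent; the large refactor and the ticket with no acceptance criteria both fall back to a human.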
Common mistakes
- Queueing ambiguous tickets: the agent thrashes for hours and produces a bad PR.
- Running without a sandbox or cost cap: one prompt-injected agent can burn hundreds of dollars.
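A cost cap can be as simple as a counter that refuses further model calls once a budget is reached. The following is a minimal sketch under stated assumptions: `CostCap`, `BudgetExceeded`, and the per-token price are hypothetical, and real pricing varies by provider and model. It does not replace a sandbox, which is a separate control.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the cap."""


class CostCap:
    def __init__(self, limit_usd: float, usd_per_1k_tokens: float):
        self.limit_usd = limit_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens  # illustrative price, not a real rate
        self.spent_usd = 0.0

    def charge(self, tokens: int) -> None:
        """Record a model call; refuse it if it would exceed the budget."""
        cost = tokens / 1000 * self.usd_per_1k_tokens
        if self.spent_usd + cost > self.limit_usd:
            # Checked before recording, so spend never exceeds the cap.
            raise BudgetExceeded(
                f"charge of ${cost:.2f} would exceed cap of ${self.limit_usd:.2f}"
            )
        self.spent_usd += cost


cap = CostCap(limit_usd=1.00, usd_per_1k_tokens=0.01)
cap.charge(60_000)          # within budget
try:
    cap.charge(60_000)      # would bring the total above the cap
    stopped = False
except BudgetExceeded:
    stopped = True          # the agent run should be aborted here
```

Checking the budget before each call, rather than after, means a runaway or prompt-injected run is stopped at the cap instead of one call past it.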
Related terms
- Background agent: A background agent is an LLM-driven worker that runs asynchronously. It receives a task, executes for minutes or hours without a user attached, and posts results when done. Cursor's Background Agents, Claude Code's async tasks, and Devin are 2026 examples.
- SWE-bench: SWE-bench is the standard benchmark for autonomous coding agents. It pairs real GitHub issues from popular Python repos with the actual fix commit; the agent must produce a patch that passes the hidden test suite.
- Agent sandbox: An agent sandbox is the isolated execution environment where an LLM-driven agent runs code, browses, or controls a desktop. It is the safety boundary that contains the blast radius of a prompt injection.
Last updated: 2026-06-01.