Durable execution
Durable execution is a workflow and agent pattern in which long-running, multi-step processes survive crashes, restarts, and queue failures: state is checkpointed after each step, and execution resumes from the last checkpoint. Inngest, Temporal, Restate, and AWS Step Functions are implementations available as of 2026.
A naive long-running script dies when its process restarts: if a queue worker crashes mid-step, the whole job starts over. Durable execution engines invert this. Each step's input and output are checkpointed, and after a crash the engine resumes from the last checkpoint, so completed steps are not repeated; only the step that was in flight runs again.
This is critical for multi-hour agents, multi-day onboarding flows, payment processing with retries, and AI background agents that run for 10+ minutes.
In these implementations the code looks linear (`await emailUser(); await waitForResponse(); await chargeCard()`), but the engine intercepts each step to record its result.
The trade-offs are cognitive complexity (workers are short-lived but the logic looks long-lived), debugging that requires understanding the checkpoint log, and side effects that are harder to reason about.
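The checkpoint-and-resume behaviour can be sketched in a few lines. This is a minimal illustration, not the API of Inngest, Temporal, Restate, or Step Functions: `WorkflowRun`, `CheckpointLog`, `onboardingWorkflow`, and the `failAtCharge` flag are hypothetical names, and the log is an in-memory `Map` standing in for the durable storage a real engine would use.

```typescript
// Hypothetical checkpoint log: step name -> recorded output.
// Assumption: a real engine persists this to durable storage, not memory.
type CheckpointLog = Map<string, unknown>;

class WorkflowRun {
  constructor(private readonly log: CheckpointLog) {}

  // Executes `fn` at most once per successfully completed step.
  // On replay, the recorded output is returned and `fn` is skipped.
  async step<T>(name: string, fn: () => Promise<T>): Promise<T> {
    if (this.log.has(name)) {
      return this.log.get(name) as T;
    }
    const output = await fn();
    this.log.set(name, output); // checkpoint only after the step succeeds
    return output;
  }
}

// Records every real execution so the replay behaviour is observable.
const executions: string[] = [];

// Linear-looking workflow; each step goes through the checkpointing wrapper.
async function onboardingWorkflow(
  run: WorkflowRun,
  failAtCharge: boolean, // illustrative switch that simulates a crash
): Promise<string> {
  const emailId = await run.step("emailUser", async () => {
    executions.push("emailUser");
    return "email-1";
  });
  const reply = await run.step("waitForResponse", async () => {
    executions.push("waitForResponse");
    return "yes";
  });
  return run.step("chargeCard", async () => {
    executions.push("chargeCard");
    if (failAtCharge) throw new Error("simulated worker crash");
    return `charged after ${emailId} / ${reply}`;
  });
}
```

Running the workflow once with `failAtCharge = true` and then again with the same log and `failAtCharge = false` skips `emailUser` and `waitForResponse` on the second pass, while `chargeCard` executes a second time because it never reached its checkpoint. That repeated execution is why side effects inside a step need to be idempotent.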
When to use durable execution
- Long-running agents / workflows that must survive restarts.
- Payment / regulatory flows where data loss is unacceptable.
Common mistakes
- Side effects without idempotency keys: a restart re-runs the interrupted step, so charges and emails can go out twice.
- Storing state in process memory: that state is lost on restart, which defeats the durability promise.
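The first mistake above can be avoided by deriving an idempotency key from identifiers that stay the same across retries. The sketch below assumes a payment provider that deduplicates on such a key; `FakePaymentProvider`, `idempotencyKey`, and the run and step names are hypothetical and stand in for whatever client and identifiers a real system uses.

```typescript
// Hypothetical provider that remembers which idempotency keys it has processed.
class FakePaymentProvider {
  private readonly seen = new Map<string, string>();
  chargeCount = 0;

  charge(key: string, amountCents: number): string {
    const previous = this.seen.get(key);
    if (previous !== undefined) {
      return previous; // duplicate request: return the original receipt
    }
    this.chargeCount += 1;
    const receipt = `receipt-${this.chargeCount}-${amountCents}`;
    this.seen.set(key, receipt);
    return receipt;
  }
}

// The key is built from stable identifiers, never from random values or
// timestamps, so a re-run of the same step produces the same key.
function idempotencyKey(runId: string, stepName: string): string {
  return `${runId}:${stepName}`;
}
```

With this shape, a step that is re-executed after a crash sends the same key and the provider returns the earlier receipt instead of charging again. A different workflow run produces a different key and is charged normally.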
FAQ
What is durable execution?
Durable execution is a workflow and agent pattern in which long-running, multi-step processes survive crashes, restarts, and queue failures: state is checkpointed after each step, and execution resumes from the last checkpoint. Inngest, Temporal, Restate, and AWS Step Functions are implementations available as of 2026.
When should I use durable execution?
Use it for long-running agents and workflows that must survive restarts, and for payment or regulatory flows where data loss is unacceptable.
What are the most common mistakes with durable execution?
Running side effects without idempotency keys, so that a restart re-runs the interrupted step and repeats charges or emails, and storing state in process memory, where it is lost on restart and defeats the durability promise.
Related terms
- Workflow engine — A workflow engine is the orchestration runtime — n8n, Make.com, Zapier, Temporal, Airflow — that executes multi-step business processes, handles retries, manages state, and integrates with external systems.
- Background agent — A background agent is an LLM-driven worker that runs asynchronously — receives a task, executes for minutes/hours without a user attached, posts results when done. Cursor's Background Agents, Claude Code's async tasks, Devin are 2026 examples.
- Event-driven agent — An event-driven agent is an LLM agent triggered by external events (webhook, queue, schedule, system signal) rather than direct chat — handles tickets, monitors logs, sends reminders, runs ETL with reasoning.
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/durable-execution.md.