LangGraph agent cheatsheet (2026 production patterns)

Production LangGraph patterns: state machine design, node types, persistent checkpointing, human-in-the-loop, sub-agents, tracing, eval integration, and the antipatterns to skip.

State machine basics

LangGraph models agents as graphs of state transitions.

Item | Description
State | TypedDict or Pydantic model holding everything the graph needs to pass between nodes.
Nodes | Functions that take state and return state. Can be LLM calls, tool calls, conditionals, or arbitrary Python.
Edges | Transitions between nodes. Static, conditional (branching), or interrupt-driven.
START / END | Special nodes marking graph entry and termination.
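
To make the vocabulary above concrete, here is a dependency-free sketch of a state machine with static and conditional edges. It deliberately does not use LangGraph's own API; the names `NODES`, `STATIC_EDGES`, `CONDITIONAL_EDGES`, and `run` are illustrative assumptions, and the two nodes stand in for real LLM or tool calls.

```python
from typing import Callable, TypedDict

START, END = "__start__", "__end__"  # illustrative sentinel node names


class State(TypedDict):
    question: str
    draft: str
    attempts: int


def write_draft(state: State) -> dict:
    # A node reads state and returns only the keys it changes.
    n = state["attempts"] + 1
    return {"draft": f"answer to {state['question']!r} (v{n})", "attempts": n}


def review(state: State) -> dict:
    # A node may also leave state untouched.
    return {}


def route_after_review(state: State) -> str:
    # Conditional edge: choose the next node name from state.
    return END if state["attempts"] >= 2 else "write_draft"


NODES: dict[str, Callable[[State], dict]] = {"write_draft": write_draft, "review": review}
STATIC_EDGES = {START: "write_draft", "write_draft": "review"}
CONDITIONAL_EDGES = {"review": route_after_review}


def run(state: State) -> tuple[State, list[str]]:
    """Walk the graph from START to END, merging each node's update into state."""
    visited: list[str] = []
    current = STATIC_EDGES[START]
    while current != END:
        visited.append(current)
        state = {**state, **NODES[current](state)}
        if current in CONDITIONAL_EDGES:
            current = CONDITIONAL_EDGES[current](state)
        else:
            current = STATIC_EDGES[current]
    return state, visited


final_state, path = run({"question": "What is a checkpoint?", "draft": "", "attempts": 0})
print(path)  # write_draft, review, write_draft, review
```

The loop between `review` and `write_draft` is the kind of cycle that needs a step cap in a real graph; here the router ends it after two attempts.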

Node types

Item | Description
LLM call | Standard inference: chat completion, structured output, function calling.
Tool call | Invoke a tool, fold result into state.
Conditional branch | Function returning the next node name based on state.
Sub-graph | An entire LangGraph as a node, for composing larger agents.
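
As a rough illustration of the tool-call and conditional-branch rows, the sketch below runs pending tool calls, folds their results into state, and routes on whether any calls remain. The `TOOLS` registry, the `pending_calls` / `tool_results` keys, and the node names are hypothetical; this is plain Python, not LangGraph's prebuilt tool node.

```python
from typing import Callable

# Hypothetical tool registry; real tools would call APIs or databases.
TOOLS: dict[str, Callable[..., object]] = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}


def tool_node(state: dict) -> dict:
    """Run every pending tool call and fold results (or errors) into state."""
    results = list(state.get("tool_results", []))
    for call in state.get("pending_calls", []):
        tool = TOOLS.get(call["name"])
        if tool is None:
            results.append({"name": call["name"], "error": "unknown tool"})
            continue
        try:
            results.append({"name": call["name"], "output": tool(**call["args"])})
        except Exception as exc:
            # Report the failure to the next node instead of crashing the run.
            results.append({"name": call["name"], "error": str(exc)})
    return {"pending_calls": [], "tool_results": results}


def route_after_llm(state: dict) -> str:
    """Conditional branch: visit the tool node only when calls are pending."""
    return "tools" if state.get("pending_calls") else "end"


state = {
    "pending_calls": [
        {"name": "add", "args": {"a": 2, "b": 3}},
        {"name": "upper", "args": {"text": "ok"}},
        {"name": "missing", "args": {}},
    ]
}
next_node = route_after_llm(state)
state = {**state, **tool_node(state)}
```

Recording tool errors in state, rather than raising, lets a later LLM node see the failure and decide whether to retry.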

Checkpointing

Production graphs persist state across turns.

Item | Description
MemorySaver | In-memory checkpointer for dev. State is lost when the process exits.
PostgresSaver / SqliteSaver | Persistent checkpointers that store graph state per thread_id. PostgresSaver is the usual production choice; SqliteSaver suits local or single-process use.
Thread IDs | Each conversation / session has a thread_id, and checkpoints map to it.
Replay | Restart from any checkpoint to debug or branch from a specific state.
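
The idea behind per-thread checkpoints and replay can be sketched with the standard library's `sqlite3`. This toy `SqliteCheckpoints` class is an assumption made for illustration, not LangGraph's SqliteSaver; the table layout and method names are invented here.

```python
import json
import sqlite3
from typing import Optional


class SqliteCheckpoints:
    """Toy checkpoint store with one row per (thread_id, step)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "thread_id TEXT, step INTEGER, state TEXT, "
            "PRIMARY KEY (thread_id, step))"
        )

    def put(self, thread_id: str, state: dict) -> int:
        """Append a checkpoint for this thread and return its step number."""
        row = self.db.execute(
            "SELECT COALESCE(MAX(step), -1) + 1 FROM checkpoints WHERE thread_id = ?",
            (thread_id,),
        ).fetchone()
        step = row[0]
        self.db.execute(
            "INSERT INTO checkpoints VALUES (?, ?, ?)",
            (thread_id, step, json.dumps(state)),
        )
        self.db.commit()
        return step

    def get(self, thread_id: str, step: Optional[int] = None) -> Optional[dict]:
        """Load the latest checkpoint, or a specific step for replay."""
        if step is None:
            row = self.db.execute(
                "SELECT state FROM checkpoints WHERE thread_id = ? "
                "ORDER BY step DESC LIMIT 1",
                (thread_id,),
            ).fetchone()
        else:
            row = self.db.execute(
                "SELECT state FROM checkpoints WHERE thread_id = ? AND step = ?",
                (thread_id, step),
            ).fetchone()
        return json.loads(row[0]) if row else None


store = SqliteCheckpoints()
store.put("thread-a", {"messages": ["hi"]})
store.put("thread-a", {"messages": ["hi", "hello"]})
store.put("thread-b", {"messages": ["other session"]})

latest_a = store.get("thread-a")          # resume the conversation
replayed = store.get("thread-a", step=0)  # replay from an earlier checkpoint
# Branch: continue from the old checkpoint under a new thread_id.
store.put("thread-a-branch", {"messages": replayed["messages"] + ["try again"]})
```

Keying every row on thread_id keeps sessions isolated, and keeping every step (rather than overwriting) is what makes replay and branching possible.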

Human-in-the-loop

Item | Description
interrupt_before / interrupt_after | Pause execution before / after specific nodes for human approval.
Resume with state edit | Human can edit state mid-pause before resuming. Useful for correction loops.
Time travel | Roll back to a previous checkpoint and try a different branch.
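
The pause, edit, and resume flow can be sketched without any library. The pipeline, the `run` helper, and its `approved` flag below are hypothetical stand-ins; in LangGraph the pause point would come from interrupt_before and the saved state from a checkpointer.

```python
from typing import Callable, Optional


def plan(state: dict) -> dict:
    return {"targets": ["tmp/a.log", "prod/db.sqlite"]}


def delete_files(state: dict) -> dict:
    return {"deleted": list(state["targets"])}  # pretend deletion, nothing is removed


def report(state: dict) -> dict:
    return {"summary": f"deleted {len(state['deleted'])} file(s)"}


PIPELINE: list[tuple[str, Callable[[dict], dict]]] = [
    ("plan", plan), ("delete_files", delete_files), ("report", report),
]


def run(state: dict, start_at: int = 0, interrupt_before: frozenset = frozenset(),
        approved: bool = False) -> tuple[dict, Optional[int]]:
    """Run nodes in order. Returns (state, paused_index); None means finished."""
    for i in range(start_at, len(PIPELINE)):
        name, node = PIPELINE[i]
        # Pause before a guarded node unless this call is an approved resume at it.
        if name in interrupt_before and not (approved and i == start_at):
            return state, i
        state = {**state, **node(state)}
    return state, None


guarded = frozenset({"delete_files"})
state, paused_at = run({}, interrupt_before=guarded)  # stops before delete_files
state_at_pause = dict(state)                           # keep the checkpoint

# Human review: drop the production file, then approve and resume.
edited = {**state, "targets": ["tmp/a.log"]}
final, paused_again = run(edited, start_at=paused_at,
                          interrupt_before=guarded, approved=True)

# Time travel: go back to the saved checkpoint and take the unedited branch.
alt, _ = run(state_at_pause, start_at=paused_at,
             interrupt_before=guarded, approved=True)
```

Because the state at the pause is saved separately, the edited run and the unedited run are independent branches from the same checkpoint.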

Sub-agents + handoffs

Item | Description
create_react_agent | Ready-made ReAct agent for many use cases. Wrap as a node in a larger graph.
Supervisor + workers | Coordinator agent routes to specialist sub-agents based on intent.
Handoff | Explicit node that transfers control to another agent's sub-graph.
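
A minimal sketch of the supervisor + workers shape, assuming a keyword match in place of an LLM intent classifier. The worker names, `INTENT_KEYWORDS`, and `handle` are invented for the example; each worker function stands in for a full sub-graph.

```python
from typing import Callable


def billing_agent(state: dict) -> dict:
    return {"answer": "Billing: refund issued.", "handled_by": "billing"}


def tech_agent(state: dict) -> dict:
    return {"answer": "Tech: try restarting the service.", "handled_by": "tech"}


def fallback_agent(state: dict) -> dict:
    return {"answer": "Routing you to a human.", "handled_by": "fallback"}


WORKERS: dict[str, Callable[[dict], dict]] = {
    "billing": billing_agent,
    "tech": tech_agent,
    "fallback": fallback_agent,
}

# Keyword lists stand in for an LLM intent classifier.
INTENT_KEYWORDS = {
    "billing": ("refund", "invoice", "charge"),
    "tech": ("error", "crash", "bug"),
}


def supervisor(state: dict) -> str:
    """Return the name of the worker that should handle this request."""
    text = state["request"].lower()
    for worker, keywords in INTENT_KEYWORDS.items():
        if any(word in text for word in keywords):
            return worker
    return "fallback"


def handle(request: str) -> dict:
    """Handoff: the supervisor picks a worker, which then owns the state update."""
    state = {"request": request}
    worker = supervisor(state)
    return {**state, **WORKERS[worker](state)}


print(handle("I want a refund for this charge")["handled_by"])
```

Keeping an explicit fallback worker means an unrecognized intent still ends in a defined node instead of an unhandled routing error.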

Observability

Item | Description
LangSmith | First-party tracing: every node, every LLM call, every state transition logged.
OpenTelemetry | Custom tracing for OTel-native shops.
Evals | Pair traces with evals so regressions on a node are caught.
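
To show what pairing traces with evals can look like at the node level, here is a standard-library sketch. The `traced` decorator, the in-memory `TRACE` list, and the tiny eval set are assumptions for illustration; a real setup would export spans to LangSmith or an OpenTelemetry collector instead.

```python
import time
from functools import wraps
from typing import Callable

TRACE: list[dict] = []  # in-memory span sink; a real setup would export these


def traced(name: str) -> Callable:
    """Decorator that records one span per node call."""
    def decorator(node: Callable[[dict], dict]) -> Callable[[dict], dict]:
        @wraps(node)
        def wrapper(state: dict) -> dict:
            started = time.perf_counter()
            span = {"node": name, "input": dict(state)}
            try:
                update = node(state)
                span.update(status="ok", output=update)
                return update
            except Exception as exc:
                span.update(status="error", error=repr(exc))
                raise
            finally:
                # Runs on success and on failure, so every call leaves a span.
                span["seconds"] = time.perf_counter() - started
                TRACE.append(span)
        return wrapper
    return decorator


@traced("classify")
def classify(state: dict) -> dict:
    # Stand-in for an LLM classification node.
    return {"label": "spam" if "win money" in state["text"].lower() else "ham"}


# Hypothetical eval set: (input state, expected label).
EVAL_CASES = [({"text": "WIN MONEY now"}, "spam"), ({"text": "Lunch at noon?"}, "ham")]


def eval_node(node: Callable[[dict], dict], cases: list) -> float:
    """Fraction of cases where the node's label matches the expected one."""
    passed = sum(node(inp)["label"] == expected for inp, expected in cases)
    return passed / len(cases)


accuracy = eval_node(classify, EVAL_CASES)
```

Running the eval through the traced node means a failing case already has a span with the exact input and output attached.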

Antipatterns

Item | Description
Mega node | A single LLM-call node doing too much. Split into smaller focused nodes.
No checkpoint | Production graphs without checkpoints can't recover from failure mid-run.
No max-step cap | Loops can run away. Always cap step count.

FAQ

Best LangGraph pattern for production?

A planner-executor split, a Postgres checkpointer, human-in-the-loop approval on destructive actions, and LangSmith tracing.

How do I prevent runaway loops?

Set a maximum step count (LangGraph's recursion_limit config key), detect loops that make no progress, and stop the run once it exceeds its token budget.
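
The three guards can be combined in one loop, sketched here in plain Python. `run_loop`, `RunawayLoop`, and the `(update, tokens)` return shape of a step are assumptions for this example; inside a real graph the step cap would be enforced by the framework and the other two checks would live in a node or router.

```python
import hashlib
import json


class RunawayLoop(RuntimeError):
    """Raised when any guard stops the loop."""


def fingerprint(state: dict) -> str:
    # Stable hash of state, used to detect steps that change nothing.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def run_loop(step_fn, state: dict, max_steps: int = 10,
             token_budget: int = 1000, max_stalled: int = 2) -> dict:
    """Call step_fn until state['done'] is truthy, enforcing three guards."""
    tokens_used = 0
    stalled = 0
    last = fingerprint(state)
    for step in range(1, max_steps + 1):
        update, tokens = step_fn(state)
        tokens_used += tokens
        if tokens_used > token_budget:
            raise RunawayLoop(f"token budget exceeded at step {step}")
        state = {**state, **update}
        if state.get("done"):
            return state
        current = fingerprint(state)
        stalled = stalled + 1 if current == last else 0
        if stalled >= max_stalled:
            raise RunawayLoop(f"no progress for {stalled} steps")
        last = current
    raise RunawayLoop(f"step cap of {max_steps} reached")


def counting_step(state: dict):
    n = state["n"] + 1
    return {"n": n, "done": n >= 3}, 50  # finishes on the third step


def stuck_step(state: dict):
    return {}, 10  # never changes state


def slow_step(state: dict):
    return {"n": state["n"] + 1}, 1  # always changes state, never finishes


def outcome(step_fn, **kwargs) -> str:
    try:
        return f"ok: {run_loop(step_fn, {'n': 0}, **kwargs)}"
    except RunawayLoop as exc:
        return f"stopped: {exc}"


print(outcome(stuck_step))
print(outcome(slow_step, max_steps=5))
print(outcome(counting_step, token_budget=80))
```

The guards catch different failures: the fingerprint check stops a loop that repeats itself, the step cap stops one that keeps changing state without finishing, and the token budget bounds cost in either case.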

Is LangGraph too heavy for simple use cases?

Probably. For single-step LLM calls a plain SDK is fine. LangGraph pays off once orchestration grows past roughly 3-5 steps.

Last updated: 2026-06-01.