technique

Context distillation

Context distillation summarises an agent's growing conversation history into a compact representation, so each step's input stays small while preserving the relevant signal.

In agent loops, the conversation history grows roughly linearly with the number of steps. Without compression, by step 20 the context is mostly stale tool outputs, so each call costs more and the model reasons worse ("lost in the middle" effects). Context distillation runs a small summariser every N steps that rewrites the history as a tight scratchpad (current goal, key facts learned, open subtasks) and drops verbatim tool outputs once their useful content has been extracted. The technique is standard in mature 2026 agent frameworks (LangGraph state graphs, OpenAI Agents SDK summarisation) and is the difference between agents that work at step 5 and agents that still work at step 50.
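The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any framework's API: `Scratchpad`, `DistillingHistory`, and `toy_summariser` are hypothetical names, and the toy summariser simply pulls out text after a `FACT:` marker where a real system would presumably call a small model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Scratchpad:
    """Compact state: current goal, key facts learned, open subtasks."""
    goal: str
    facts: List[str] = field(default_factory=list)
    open_subtasks: List[str] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"Goal: {self.goal}"]
        lines += [f"Fact: {f}" for f in self.facts]
        lines += [f"Open: {t}" for t in self.open_subtasks]
        return "\n".join(lines)


# Hypothetical summariser signature: (old scratchpad, raw steps) -> new scratchpad.
Summariser = Callable[[Scratchpad, List[str]], Scratchpad]


@dataclass
class DistillingHistory:
    scratchpad: Scratchpad
    summarise: Summariser
    every_n: int = 5
    raw_steps: List[str] = field(default_factory=list)

    def add_step(self, text: str) -> None:
        self.raw_steps.append(text)
        if len(self.raw_steps) >= self.every_n:
            # Fold the raw steps into the scratchpad, then drop them verbatim.
            self.scratchpad = self.summarise(self.scratchpad, self.raw_steps)
            self.raw_steps = []

    def context(self) -> str:
        """What the next agent step would see: scratchpad plus undistilled steps."""
        return "\n".join([self.scratchpad.render(), *self.raw_steps])


def toy_summariser(pad: Scratchpad, steps: List[str]) -> Scratchpad:
    # Stand-in for a small model call: keep only text after a "FACT:" marker.
    new_facts = [s.split("FACT:", 1)[1].strip() for s in steps if "FACT:" in s]
    return Scratchpad(pad.goal, pad.facts + new_facts, pad.open_subtasks)


history = DistillingHistory(
    Scratchpad(goal="find the failing test"), toy_summariser, every_n=3
)
for i in range(7):
    history.add_step(f"tool output {i} ... FACT: file_{i}.py inspected")

print(history.context())
```

After seven steps with `every_n=3`, the first six tool outputs have been folded into the scratchpad as facts and only the seventh remains verbatim, so the rendered context stays short no matter how long the loop runs.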

When to use context distillation

Use it for long-horizon agent loops (10+ steps), multi-turn assistants with hours-long sessions, and RAG workflows with many retrieved-document turns.

Common mistakes

The two most common mistakes are distilling too aggressively, which loses context the next step needed, and skipping distillation until you hit the context limit, by which point quality has already degraded.

FAQ

What is context distillation?

Context distillation summarises an agent's growing conversation history into a compact representation, so each step's input stays small while preserving the relevant signal.

When should I use context distillation?

Long-horizon agent loops (10+ steps). Multi-turn assistants with hours-long sessions. RAG workflows with many retrieved-document turns.

What are the most common mistakes with context distillation?

Distilling too aggressively — losing context the next step needed. Skipping distillation until you hit the context limit — by then quality has already degraded.
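One way to guard against both mistakes is to trigger distillation at a fraction of the context limit rather than at the limit itself, and to always keep the most recent messages verbatim. The sketch below assumes this approach; `maybe_distill`, `estimate_tokens`, the 0.5 trigger ratio, the four-message window, and the four-characters-per-token estimate are all illustrative assumptions, not values from any particular framework.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. Use a real tokenizer in practice.
    return max(1, len(text) // 4)


def maybe_distill(
    messages: List[str],
    summarise: Callable[[List[str]], str],
    context_limit: int,
    trigger_ratio: float = 0.5,
    keep_recent: int = 4,
) -> List[str]:
    """Distill early (at trigger_ratio of the limit) and keep recent messages verbatim."""
    used = sum(estimate_tokens(m) for m in messages)
    if used < trigger_ratio * context_limit or len(messages) <= keep_recent:
        return messages  # still comfortably under budget, nothing to do

    split = len(messages) - keep_recent
    old, recent = messages[:split], messages[split:]
    # Only the older messages are summarised; the recent window is untouched,
    # so the next step still sees the detail it is most likely to need.
    return [summarise(old), *recent]


# Hypothetical stand-in for a summariser model call.
def fake_summarise(old: List[str]) -> str:
    return f"SUMMARY of {len(old)} earlier steps"


messages = [f"step {i}: " + "x" * 400 for i in range(10)]
distilled = maybe_distill(messages, fake_summarise, context_limit=2000)
print(len(messages), "->", len(distilled))
```

Raising `keep_recent` trades a larger context for a lower risk of losing detail the next step needs; lowering `trigger_ratio` distills earlier, before quality starts to degrade.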

Last updated: 2026-06-01.