Extended thinking
Extended thinking is Anthropic's flag on Claude that allocates a configurable budget of internal reasoning tokens before the user-visible answer — enabling deeper reasoning on hard problems for a higher cost.
Anthropic's equivalent of OpenAI's o-series reasoning. When you enable extended thinking with a token budget (e.g. 8,000 tokens), Claude runs an internal chain-of-thought that the user never sees before emitting the final answer. The technique materially improves performance on math, code, planning, and multi-step reasoning. By 2026 it's a per-call setting on Claude 4.x. Best practice: route by task — disable extended thinking for chat and extraction, enable with a 4K-8K budget for hard reasoning, escalate to 16K only when measured quality warrants the cost.
When to use extended thinking
- Math, code, planning, multi-step reasoning.
- Agent step decisions where one wrong step cascades.
Common mistakes
- Adding explicit "think step by step" instructions when extended thinking is enabled — often hurts quality.
- Running every query with extended thinking — costs and latency blow up.
FAQ
What is extended thinking?
Extended thinking is Anthropic's flag on Claude that allocates a configurable budget of internal reasoning tokens before the user-visible answer — enabling deeper reasoning on hard problems for a higher cost.
When should I use extended thinking?
Math, code, planning, multi-step reasoning. Agent step decisions where one wrong step cascades.
What are the most common mistakes with extended thinking?
Adding explicit "think step by step" instructions when extended thinking is enabled — often hurts quality. Running every query with extended thinking — costs and latency blow up.
Related terms
- Reasoning model — A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.
- Reasoning tokens — Reasoning tokens (or thinking tokens) are the internal chain-of-thought tokens reasoning models produce before the user-visible answer — billed separately and not shown to the end user.
- Chain-of-thought prompting — Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, increasing accuracy on multi-step problems.
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/extended-thinking.md.