Chain-of-thought prompting
Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, increasing accuracy on multi-step problems.
Chain-of-thought (CoT) is a prompting technique introduced by Wei et al. (2022) in which the model is instructed to expose its intermediate reasoning before delivering a final answer. The two most common forms are zero-shot CoT (appending a trigger phrase such as "Let's think step by step.") and few-shot CoT (showing 2–5 worked examples in the prompt). CoT measurably improves performance on arithmetic, commonsense reasoning, and symbolic tasks, but it costs extra output tokens and adds latency. Modern reasoning models such as OpenAI o1, Claude with extended thinking, and Gemini 2.0 Flash Thinking effectively run CoT internally, so manual CoT is most valuable on smaller or non-reasoning models.
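The two forms differ only in how the prompt string is assembled. The sketch below shows one possible way to build each; the function names, the `Q:`/`A:` layout, and the `Answer:` marker are illustrative assumptions, not a fixed standard.

```python
# Illustrative prompt builders; names and the Q:/A: layout are assumptions.
ZERO_SHOT_TRIGGER = "Let's think step by step."


def zero_shot_cot(question: str) -> str:
    """Zero-shot CoT: append the trigger phrase, supply no examples."""
    return f"Q: {question}\nA: {ZERO_SHOT_TRIGGER}"


def few_shot_cot(question: str, examples: list[tuple[str, str, str]]) -> str:
    """Few-shot CoT: prefix worked examples of (question, reasoning, answer)."""
    blocks = [
        f"Q: {q}\nA: {reasoning} Answer: {answer}"
        for q, reasoning, answer in examples
    ]
    # The new question goes last, with the answer slot left open for the model.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)
```

Either string would then be sent to whatever model client you use; that call is deliberately left out here because it depends on the provider.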
When to use chain-of-thought prompting
- Math, logic, and multi-step planning prompts.
- Smaller models (8B–70B) where reasoning capability is weaker.
- When you can afford 2–10× the output tokens of a direct answer.
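To make the 2–10× figure concrete, the following rough sketch estimates the output-token range for a CoT response from the length of a direct answer. The function names and the sample numbers are hypothetical; real usage varies by model and task.

```python
# Back-of-the-envelope budgeting; the 2x and 10x defaults come from the
# range quoted above, everything else is an illustrative assumption.
def cot_output_token_range(
    direct_tokens: int, low: int = 2, high: int = 10
) -> tuple[int, int]:
    """Return the (low, high) estimate of output tokens with CoT enabled."""
    return direct_tokens * low, direct_tokens * high


def fits_budget(
    direct_tokens: int, max_output_tokens: int, multiplier: int = 10
) -> bool:
    """Check whether the estimate at the given multiplier fits the output cap."""
    return direct_tokens * multiplier <= max_output_tokens
```

For instance, a direct answer of about 50 tokens would be estimated at 100 to 500 tokens with CoT, so an output cap of 400 tokens would be too tight for the worst case.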
When not to use chain-of-thought prompting
- Trivial extraction or single-step classification (CoT adds cost with no quality gain).
- Real-time UX with strict latency budgets.
- Reasoning models that already think internally — explicit CoT can hurt.
Example
Input: If a shirt costs $20 and is 25% off, what's the final price? Think step by step.
Output: Step 1: 25% of $20 = $5. Step 2: $20 - $5 = $15. Final price: $15.
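The model's two steps can be checked independently. This small sketch reproduces the same arithmetic with exact fractions; the function name is hypothetical.

```python
from fractions import Fraction


def discounted_price(price: int, percent_off: int) -> tuple[Fraction, Fraction]:
    """Mirror the two reasoning steps: compute the discount, then subtract it."""
    discount = price * Fraction(percent_off, 100)  # Step 1: 25% of 20
    final = price - discount                       # Step 2: 20 - discount
    return discount, final


discount, final = discounted_price(20, 25)
print(f"discount={discount}, final={final}")  # prints "discount=5, final=15"
```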
Common mistakes
- Using CoT on reasoning models — it often degrades quality.
- Forgetting to mark the final answer (e.g. with 'Answer:'), which makes it hard to parse.
- Pairing CoT with high temperature, which makes reasoning incoherent.
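If the prompt asks the model to end with a line starting with 'Answer:', the final answer can be pulled out with a short regular expression. This is a minimal sketch under that assumption; the marker format and function name are illustrative, and real outputs may need looser matching.

```python
import re
from typing import Optional

# Matches a line of the form "Answer: <text>"; the marker is an assumed convention.
ANSWER_PATTERN = re.compile(r"^Answer:\s*(.+?)\s*$", re.MULTILINE)


def extract_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if there is none."""
    matches = ANSWER_PATTERN.findall(completion)
    # Take the last match in case the reasoning itself mentions "Answer:" earlier.
    return matches[-1] if matches else None
```

Returning `None` instead of raising lets the caller decide whether to retry, fall back, or log the unparseable completion.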
FAQ
What is chain-of-thought prompting?
Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, increasing accuracy on multi-step problems.
When should I use chain-of-thought prompting?
Use it for math, logic, and multi-step planning prompts; on smaller models (8B–70B) where reasoning capability is weaker; and when you can afford 2–10× the output tokens of a direct answer.
What are the most common mistakes with chain-of-thought prompting?
The most common mistakes are using CoT on reasoning models, where it often degrades quality; forgetting to mark the final answer (e.g. with 'Answer:'), which makes it hard to parse; and pairing CoT with high temperature, which makes reasoning incoherent.
Related terms
- Few-shot prompting — Few-shot prompting supplies 2–10 input–output examples inside the prompt so the model imitates the pattern on a new input.
- Zero-shot prompting — Zero-shot prompting asks the model to perform a task with no examples — only the instruction and the input.
- Self-consistency — Self-consistency runs the same prompt multiple times at non-zero temperature and picks the most common final answer, raising accuracy on reasoning tasks.
- Reasoning model — A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.
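Self-consistency, as described in the list above, reduces to a majority vote over sampled final answers. The sketch below assumes a hypothetical `sample` callable that runs the CoT prompt once at non-zero temperature and returns the parsed final answer (or `None` if parsing failed); the model call itself is not shown.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistent_answer(
    sample: Callable[[], Optional[str]], n: int = 5
) -> Optional[str]:
    """Call `sample` n times and return the most common non-None answer."""
    answers = [a for a in (sample() for _ in range(n)) if a is not None]
    if not answers:
        return None
    # Ties go to the answer that was seen first.
    return Counter(answers).most_common(1)[0][0]
```

Because each sample is a full CoT completion, the token cost scales linearly with `n`, which compounds the overhead CoT already adds.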
Sources
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/chain-of-thought.md.