Reasoning model
A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.
Reasoning models (OpenAI o1/o3, Claude 3.7/4 with extended thinking, Gemini 2.0 Flash Thinking, DeepSeek R1, Qwen QwQ) are post-trained to run a long internal CoT, often thousands of tokens of "thinking", before emitting the user-facing answer. They lead benchmarks on math, code, and multi-step reasoning, but they are slower (typically 5–30 seconds per response) and more expensive per query, because the thinking tokens are generated and typically billed like output tokens. For trivial tasks they are overkill and only add latency. Best practice in 2026 is to route: a fast non-reasoning model for chat and extraction, a reasoning model for planning, hard math, complex code, and agent step-decisions.
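The routing idea above can be sketched as a small lookup. This is a minimal illustration, not a reference implementation: the model names and task labels are hypothetical placeholders, and defaulting unknown tasks to the fast model is an assumption.

```python
# Hypothetical model identifiers; substitute the models your provider offers.
FAST_MODEL = "fast-chat-model"
REASONING_MODEL = "reasoning-model"

# Task categories from the paragraph above; the string labels are assumptions.
REASONING_TASKS = {"planning", "hard_math", "complex_code", "agent_step"}


def route(task_type: str) -> str:
    """Return the model name to use for a given task category."""
    if task_type in REASONING_TASKS:
        return REASONING_MODEL
    # Chat, extraction, and anything unrecognized take the cheap, fast path.
    return FAST_MODEL
```

In practice the task label would come from the calling code or a lightweight classifier. Sending unknown tasks to the fast model keeps cost and latency low unless a request is known to be hard.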
When to use a reasoning model
- Math, code, complex planning, multi-hop QA.
- Agent step-decisions where one wrong step cascades.
- Anything where 10 extra seconds is worth a 20-point accuracy gain.
When not to use a reasoning model
- Real-time chat with strict <2 s latency budget.
- Trivial extraction or classification (waste of money).
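The two lists above can be folded into one boolean check. The sketch below is illustrative only: the task labels are hypothetical, and the 5-second floor is simply the low end of the typical latency range quoted earlier in this entry.

```python
# Task categories named in the two lists above; the string labels are hypothetical.
HARD_TASKS = {"math", "code", "planning", "multi_hop_qa", "agent_step"}
TRIVIAL_TASKS = {"extraction", "classification"}

# Low end of the 5-30 s range this entry gives for reasoning models.
MIN_REASONING_LATENCY_S = 5.0


def should_use_reasoning(task_type: str, latency_budget_s: float) -> bool:
    """Apply the when / when-not rules as a simple boolean check."""
    if latency_budget_s < MIN_REASONING_LATENCY_S:
        return False  # the call would likely exceed the latency budget
    if task_type in TRIVIAL_TASKS:
        return False  # extra cost buys little on easy tasks
    return task_type in HARD_TASKS
```

For example, `should_use_reasoning("math", 30.0)` is true, while the same task with a 1.5-second budget is not, which matches the real-time chat case in the list.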
Common mistakes
- Adding explicit CoT instructions (such as "think step by step") to a reasoning model's prompt. The model already reasons internally, so this often hurts.
- Using a reasoning model for every API call instead of routing.
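One way to catch the first mistake is to lint prompts before they reach a reasoning model. The sketch below is a rough heuristic: the phrase list is illustrative, neither exhaustive nor taken from any vendor's guidance, and the function name is hypothetical.

```python
import re

# Illustrative phrases only; extend or trim this list for your own prompts.
COT_PATTERNS = [
    r"think step[- ]by[- ]step",
    r"let'?s think",
    r"show your (reasoning|work)",
    r"explain your reasoning",
]
_COT_RE = re.compile("|".join(COT_PATTERNS), re.IGNORECASE)


def has_explicit_cot(prompt: str) -> bool:
    """Return True if the prompt contains a common chain-of-thought instruction."""
    return _COT_RE.search(prompt) is not None
```

A router could call `has_explicit_cot` on any prompt it sends to a reasoning model and log a warning when it returns true, rather than silently rewriting the prompt.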
FAQ
What is a reasoning model?
A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.
When should I use a reasoning model?
Use one for math, code, complex planning, and multi-hop QA; for agent step-decisions where one wrong step cascades; and for anything where 10 extra seconds is worth a 20-point accuracy gain.
What are the most common mistakes with a reasoning model?
Adding explicit CoT instructions to a reasoning model's prompt, which often hurts because the model already reasons internally, and using a reasoning model for every API call instead of routing.
Related terms
- Chain-of-thought prompting — Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, increasing accuracy on multi-step problems.
- AI agent — An AI agent is a system where a language model autonomously plans and executes a sequence of tool calls to accomplish a goal.
- System prompt — A system prompt is the high-priority instruction block that defines a model's role, constraints, and default behaviors for an entire conversation.
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/reasoning-model.md.