Reasoning model

A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.

Reasoning models (OpenAI o1/o3, Claude 3.7/4 with extended thinking, Gemini 2.0 Flash Thinking, DeepSeek R1, Qwen QwQ) are post-trained to run a long internal chain-of-thought (CoT), often thousands of "thinking" tokens, before emitting the user-facing answer. They dominate benchmarks on math, code, and multi-step reasoning, but are slower (5–30 seconds is typical) and more expensive per query. For trivial tasks they are overkill and add latency. Best practice in 2026 is to route: a fast non-reasoning model for chat and extraction, a reasoning model for planning, hard math, complex code, and agent step-decisions.
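The routing rule above can be sketched as a small lookup. This is a minimal illustration, not a prescribed implementation: the model identifiers, the task labels, and the `route` function are hypothetical names chosen for this example, and the fallback to the fast tier for unknown labels is an assumption.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; substitute whatever your provider exposes.
FAST_MODEL = "fast-chat-model"
REASONING_MODEL = "reasoning-model"

# Coarse task labels mirroring the routing guidance in the text.
REASONING_TASKS = {"planning", "hard_math", "complex_code", "agent_step"}
FAST_TASKS = {"chat", "extraction"}


@dataclass(frozen=True)
class Route:
    model: str
    reason: str


def route(task_type: str) -> Route:
    """Pick a model tier from a coarse task label."""
    if task_type in REASONING_TASKS:
        return Route(REASONING_MODEL, "benefits from long internal reasoning")
    if task_type in FAST_TASKS:
        return Route(FAST_MODEL, "latency-sensitive, low difficulty")
    # Assumption: unknown labels fall back to the cheaper, faster tier.
    return Route(FAST_MODEL, "unknown task type, defaulting to fast tier")
```

In practice the task label might come from a cheap classifier or from the calling code itself; the point is only that the decision is made per request rather than once for the whole application.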

When to use a reasoning model

When not to use a reasoning model

Common mistakes

FAQ

What is a reasoning model?

A reasoning model is an LLM trained to produce extensive internal chain-of-thought before its final answer, trading latency for higher accuracy on hard problems.

When should I use a reasoning model?

Use one for math, code, complex planning, and multi-hop QA, and for agent step-decisions where one wrong step cascades. In general, it fits any task where 10 extra seconds is worth a 20-point accuracy gain.
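To see why cascading agent steps justify the extra latency, here is a rough sketch of how per-step accuracy compounds over a multi-step task. It assumes every step must succeed and that step errors are independent, which is a simplification; the function name and the accuracy figures in the usage note are illustrative, not measurements.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps
```

Under these assumptions, a 10-step task at 95% per-step accuracy finishes correctly about 60% of the time, while 99% per-step accuracy raises that to about 90%. A small per-step gain can therefore matter far more in an agent loop than in a single-turn call.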

What are the most common mistakes with reasoning models?

Adding explicit CoT instructions (such as "think step by step") to a reasoning model, which often hurts because the model already reasons internally. Using a reasoning model for every API call instead of routing.
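One way to avoid the first mistake is to attach the CoT cue only when the target is not a reasoning model. The sketch below assumes the caller already knows which tier it is addressing; `build_prompt`, `COT_SUFFIX`, and the `is_reasoning_model` flag are hypothetical names for this example.

```python
# A common chain-of-thought cue for non-reasoning models.
COT_SUFFIX = "Let's think step by step."


def build_prompt(question: str, is_reasoning_model: bool) -> str:
    """Append a CoT cue only for models that do not reason internally."""
    question = question.strip()
    if is_reasoning_model:
        # Reasoning models already produce internal CoT; send the task as-is.
        return question
    return f"{question}\n\n{COT_SUFFIX}"
```

Keeping this decision in one helper means the same task text can be sent to either tier without hand-editing prompts.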

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/reasoning-model.md.