# Reasoning tokens

**Source:** https://promtable.com/glossary/reasoning-tokens

> Reasoning tokens (or thinking tokens) are the internal chain-of-thought tokens reasoning models produce before the user-visible answer — billed separately and not shown to the end user.

---
Reasoning tokens (or thinking tokens) are the internal chain-of-thought tokens reasoning models produce before the user-visible answer — billed separately and not shown to the end user.

Reasoning models like OpenAI o-series, Claude with extended thinking, and Gemini 2 Thinking generate thousands of internal reasoning tokens for hard problems before emitting the final answer. APIs surface this as a separate token count (and price) and let developers cap it (budget_tokens for Anthropic, max_completion_tokens minus visible output for OpenAI). Higher reasoning budgets improve quality on hard math, code, and planning at the cost of latency and price. For trivial tasks they add latency without improving anything. Best practice in 2026 is to route by task: standard model for chat and extraction, reasoning model with explicit budget for hard steps.

## When to use

- Hard math, code, planning, multi-step reasoning.
- Agent step decisions where one wrong step cascades.

## Common mistakes

- Running every query through reasoning tokens — costs and latency blow up.
- Capping the budget too low on hard problems — quality drops sharply at the edge.

## Related terms

- [reasoning-model](https://promtable.com/glossary/reasoning-model)
- [chain-of-thought](https://promtable.com/glossary/chain-of-thought)
- [model-router](https://promtable.com/glossary/model-router)

*Last updated: 2026-06-01*
---

Original page: https://promtable.com/glossary/reasoning-tokens
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/reasoning-tokens".
Contact: info@vibecodingturkey.com.