# Test-time scaling

**Source:** https://promtable.com/glossary/test-time-scaling

> Test-time scaling is the trend of allocating more inference compute — longer reasoning traces, more samples, more verification — to get better answers from the same trained model.

---

Between 2024 and 2026, much of the field's attention for hard reasoning tasks shifted from "train bigger models" to "spend more compute at inference". Common techniques include:

- Extended reasoning (OpenAI's o-series, Claude extended thinking).
- Self-consistency: sample N reasoning paths and take a vote over the final answers.
- Chain-of-verification: draft, critique, then revise.
- Best-of-N with a verifier that scores or checks each candidate.
- Monte Carlo tree search over LLM moves.

Empirically, these techniques scale predictably: more inference compute yields better performance, up to a plateau. The trade-off is cost. As a rough guide, 10× the compute might buy a 15-30% accuracy improvement on hard tasks. Reserve test-time scaling for high-stakes inference where being right matters more than being cheap.
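As an illustration, two of the techniques above (self-consistency and best-of-N with a verifier) can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: `sample_answer` and `score` are hypothetical callables standing in for a real model call and a real verifier, and `make_stub_sampler` is an assumed helper that replays canned answers so the example runs without any API.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Iterable


def self_consistency(sample_answer: Callable[[], str], n: int) -> str:
    """Sample n final answers and return the most frequent one (majority vote)."""
    answers = [sample_answer() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]


def best_of_n(sample_answer: Callable[[], str],
              score: Callable[[str], float],
              n: int) -> str:
    """Sample n candidates and return the one the verifier scores highest."""
    candidates = [sample_answer() for _ in range(n)]
    return max(candidates, key=score)


def make_stub_sampler(canned: Iterable[str]) -> Callable[[], str]:
    """Hypothetical stand-in for a model call: replays canned answers in order."""
    answers = cycle(canned)
    return lambda: next(answers)


if __name__ == "__main__":
    # Three of the five sampled answers agree, so the vote picks "42".
    voter = make_stub_sampler(["42", "41", "42", "43", "42"])
    print(self_consistency(voter, n=5))

    # Toy verifier (an assumption): prefers answers numerically closest to 42.
    judge = make_stub_sampler(["40", "45", "42"])
    print(best_of_n(judge, score=lambda a: -abs(int(a) - 42), n=3))
```

In both functions the inference cost grows linearly with `n`, which is where the extra test-time compute goes. Self-consistency needs answers that can be compared for equality, while best-of-N is only as good as the verifier behind `score`.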

## When to use

- Hard reasoning, math, planning.
- High-stakes inference where errors are costly.

## Common mistakes

- Applying test-time scaling to tasks where the baseline already saturates: the extra compute brings no benefit.
- Ignoring the latency cost: test-time scaling can push response times to 30 seconds or more.
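Both mistakes can be checked with back-of-the-envelope arithmetic before committing to a sampling budget. The sketch below is illustrative only. `oracle_best_of_n_accuracy` assumes independent samples and a perfect verifier, so it is an upper bound rather than a prediction. The token price and decoding speed passed to `estimate_budget` are hypothetical inputs, not figures from any provider.

```python
from dataclasses import dataclass


def oracle_best_of_n_accuracy(p_single: float, n: int) -> float:
    """Upper bound on best-of-N accuracy: chance that at least one of n
    independent samples is correct, assuming a perfect verifier."""
    return 1.0 - (1.0 - p_single) ** n


@dataclass
class Budget:
    cost_usd: float
    latency_s: float


def estimate_budget(n_samples: int,
                    tokens_per_sample: int,
                    usd_per_1k_tokens: float,
                    tokens_per_second: float,
                    parallel: bool = True) -> Budget:
    """Rough cost and latency for sampling n reasoning traces.

    Cost always scales with the number of samples. Latency scales with it
    only when the samples are generated one after another.
    """
    total_tokens = n_samples * tokens_per_sample
    cost = total_tokens / 1000 * usd_per_1k_tokens
    per_sample_latency = tokens_per_sample / tokens_per_second
    latency = per_sample_latency if parallel else per_sample_latency * n_samples
    return Budget(cost_usd=cost, latency_s=latency)


if __name__ == "__main__":
    # A weak baseline has a lot of headroom; a saturated one has almost none.
    for p in (0.30, 0.95):
        gain = oracle_best_of_n_accuracy(p, 8) - p
        print(f"baseline {p:.2f}: at most {gain:+.3f} from 8 samples")

    # Hypothetical numbers: 10 traces of 2,000 tokens at 100 tokens/s.
    print(estimate_budget(10, 2000, usd_per_1k_tokens=0.01, tokens_per_second=100))
    print(estimate_budget(10, 2000, usd_per_1k_tokens=0.01, tokens_per_second=100,
                          parallel=False))
```

With these assumed inputs, a single 2,000-token trace takes about 20 seconds to decode, so sequential sampling multiplies an already noticeable delay, while parallel sampling keeps latency flat at the price of higher cost.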

## Related terms

- [reasoning-model](https://promtable.com/glossary/reasoning-model)
- [reasoning-tokens](https://promtable.com/glossary/reasoning-tokens)
- [self-consistency](https://promtable.com/glossary/self-consistency)
- [chain-of-verification](https://promtable.com/glossary/chain-of-verification)

*Last updated: 2026-06-01*

---

Original page: https://promtable.com/glossary/test-time-scaling
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/test-time-scaling".
Contact: info@vibecodingturkey.com.