# Top-p (nucleus sampling)

**Source:** https://promtable.com/glossary/top-p

> Top-p (nucleus sampling) restricts the model to the smallest set of tokens whose cumulative probability is at least p, then samples from that set.

---

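Top-p, introduced by Holtzman et al. (2019), narrows the next-token candidates dynamically, based on the shape of the distribution rather than a fixed count. At top-p=0.9 the model ranks tokens by probability, keeps the smallest top-ranked set whose probabilities sum to at least 0.9, renormalizes over that set, and samples from it. The size of the set adapts to context: when the model is confident, a handful of tokens (sometimes just one) already covers 90% of the probability mass, while uncertain contexts admit many more candidates. This adaptivity is why top-p is often a better diversity knob than temperature, which rescales the whole distribution but never removes the low-probability tail.

Most teams adjust top-p or temperature, not both. Common production settings: top-p=1.0 (no truncation) with temperature 0–0.3 for factual tasks; top-p=0.9 with temperature 0.7 for creative writing.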
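To make the selection rule concrete, here is a minimal sketch in plain Python. It is an illustration of the definition above, not any library's implementation: the function names `nucleus` and `sample_top_p` and the two toy distributions are made up for this example, and real decoders apply the same idea to logits over a full vocabulary.

```python
import random

def nucleus(probs, top_p):
    """Return the smallest list of (token, prob) pairs, taken in descending
    probability order, whose cumulative probability is at least top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

def sample_top_p(probs, top_p, rng=random):
    """Sample one token from the nucleus."""
    kept = nucleus(probs, top_p)
    tokens = [token for token, _ in kept]
    weights = [p for _, p in kept]
    # random.choices treats weights as relative, so this is equivalent to
    # renormalizing the kept probabilities before sampling.
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distributions, invented for illustration.
confident = {"Paris": 0.92, "Lyon": 0.04, "Nice": 0.02, "Rome": 0.02}
uncertain = {"red": 0.28, "blue": 0.24, "green": 0.20,
             "gold": 0.14, "grey": 0.09, "pink": 0.05}

print(len(nucleus(confident, 0.9)))  # nucleus size in a confident context
print(len(nucleus(uncertain, 0.9)))  # nucleus size in an uncertain context
print(sample_top_p(uncertain, 0.9))
```

With these toy numbers the same top-p=0.9 keeps a single token in the confident case and five of the six tokens in the uncertain case, which is the adaptive behavior described above.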

## When to use

- Open-ended generation where you want adaptive diversity.
- As an alternative to temperature when you want bounded randomness.

## Common mistakes

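- Setting both temperature and top-p aggressively low: sampling collapses toward greedy decoding and the output becomes degenerate (repetitive, generic text).
- Using top-p < 0.5: the candidate set often shrinks to one or two tokens, which usually produces robotic text.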
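The interaction behind both mistakes can be checked numerically. The sketch below applies temperature to a set of logits, takes the softmax, and counts how many tokens survive the top-p cutoff. The helper name `nucleus_size` and the logit values are assumptions made for this example, not taken from any model or library.

```python
import math

def nucleus_size(logits, temperature, top_p):
    """Count the tokens that survive top-p after temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = sorted((e / total for e in exps), reverse=True)
    cumulative = 0.0
    for count, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= top_p:
            return count
    # Rounding can leave the running sum slightly below top_p = 1.0.
    return len(probs)

# Hypothetical logits for six candidate tokens, invented for illustration.
logits = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5]

for temperature, top_p in [(1.0, 1.0), (0.7, 0.9), (0.2, 0.3)]:
    size = nucleus_size(logits, temperature, top_p)
    print(f"temperature={temperature} top_p={top_p} -> {size} candidate(s)")
```

On these logits the aggressive pair (temperature 0.2, top-p 0.3) leaves a single candidate, so sampling behaves like greedy decoding. Lowering temperature sharpens the distribution, and a low top-p then cuts whatever tail remains.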

## Related terms

- [temperature](https://promtable.com/glossary/temperature)
- [top-k](https://promtable.com/glossary/top-k)
- [sampling](https://promtable.com/glossary/sampling)

## Sources

- [Holtzman et al. 2019 (arXiv)](https://arxiv.org/abs/1904.09751)

*Last updated: 2026-06-01*

---

Original page: https://promtable.com/glossary/top-p
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/top-p".
Contact: info@vibecodingturkey.com.