Chain-of-verification
Chain-of-verification (CoVe) is a prompting technique where the model first drafts an answer, then generates verification questions for each claim, answers them independently, and revises the draft accordingly.
Chain-of-verification (Dhuliawala et al., 2023) measurably reduces hallucinations on factual generation tasks. The recipe has four steps: (1) generate a baseline draft, (2) plan verification questions that target each claim in the draft, (3) answer each verification question independently of the draft, so the answers cannot simply echo it, and (4) revise the draft using the verification answers. The technique costs roughly 2-4× the tokens of a single-shot answer, but it recovers most of the reliability lost when the draft is confidently wrong. Some 2026 production agent stacks run CoVe under the hood, and it is worth using whenever factual accuracy matters more than latency.
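The four steps can be sketched as a short pipeline. This is a minimal illustration, not the reference implementation from the paper: the `LLM` callable, the prompt wording, and the helper `parse_questions` are all assumptions made for the example, and any client that maps a prompt string to a completion string could be plugged in.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def parse_questions(plan: str) -> List[str]:
    """Split a one-question-per-line plan into clean question strings."""
    cleaned = (line.strip().lstrip("-*• ").strip() for line in plan.splitlines())
    return [q for q in cleaned if q]


def chain_of_verification(question: str, llm: LLM) -> str:
    """Run the four CoVe steps and return the revised answer."""
    # Step 1: generate a baseline draft.
    draft = llm(f"Answer the question.\n\nQuestion: {question}")

    # Step 2: plan verification questions that target the draft's claims.
    plan = llm(
        "List one verification question per factual claim in the draft, "
        "one per line.\n\n"
        f"Question: {question}\nDraft: {draft}"
    )
    questions = parse_questions(plan)

    # Step 3: answer each question independently. The prompt contains only
    # the verification question, never the draft.
    answers = [llm(f"Answer concisely.\n\nQuestion: {q}") for q in questions]

    # Step 4: revise the draft using the verification answers.
    checks = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return llm(
        "Revise the draft so that it agrees with the verification answers. "
        "Drop any claim they contradict.\n\n"
        f"Question: {question}\nDraft: {draft}\nVerification:\n{checks}"
    )
```

With N verification questions this sketch makes N + 3 model calls, which is where the extra token cost comes from. The parser assumes bulleted or bare lines; a numbered plan would need its numbering stripped as well.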
When to use chain-of-verification
- Factual generation where wrong answers are costly.
- Long-form content with multiple claims.
- Background tasks where 2-4× token cost is acceptable.
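Whether the 2-4× token cost is acceptable can be checked with simple arithmetic. The sketch below assumes you already know the typical single-shot token count for the task; the function names are hypothetical, and the multipliers are just the rough range quoted on this page, not measured values.

```python
from typing import Tuple


def cove_token_range(
    single_shot_tokens: int, low: float = 2.0, high: float = 4.0
) -> Tuple[int, int]:
    """Estimate the (low, high) CoVe token cost from a single-shot cost."""
    return round(single_shot_tokens * low), round(single_shot_tokens * high)


def fits_budget(single_shot_tokens: int, budget_tokens: int) -> bool:
    """Conservative check: the high end of the estimate must fit the budget."""
    return cove_token_range(single_shot_tokens)[1] <= budget_tokens


print(cove_token_range(800))   # (1600, 3200)
print(fits_budget(800, 3000))  # False
```

An 800-token single-shot answer therefore becomes roughly 1,600 to 3,200 tokens under CoVe.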
Common mistakes
- Letting the verification step reuse the original draft: the model rationalises its first answer instead of checking it.
- Skipping the independent re-answer: the verification becomes performative and catches nothing.
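Both mistakes come down to what goes into the verification prompt. One way to guard against them, sketched here under assumptions, is to build that prompt from a function that never receives the draft and to add a rough leak check. The names `verification_prompt` and `leaks_draft` are hypothetical, and the word-window match is a crude heuristic rather than a robust detector.

```python
def verification_prompt(verification_question: str) -> str:
    """Build the prompt for one verification question.

    The function deliberately takes no draft argument, so the draft
    cannot end up in the prompt by accident.
    """
    return (
        "Answer from your own knowledge, concisely.\n\n"
        f"Question: {verification_question}"
    )


def leaks_draft(prompt: str, draft: str, min_words: int = 5) -> bool:
    """Heuristic: True if the prompt contains any run of `min_words`
    consecutive words from the draft (or the whole draft, if shorter)."""
    words = draft.split()
    if not words:
        return False
    normalized = " ".join(prompt.split())
    windows = max(1, len(words) - min_words + 1)
    return any(
        " ".join(words[i:i + min_words]) in normalized for i in range(windows)
    )


draft = "The Eiffel Tower was completed in 1889 in Paris"
good = verification_prompt("When was the Eiffel Tower completed?")
bad = good + "\nDraft: " + draft

print(leaks_draft(good, draft))  # False
print(leaks_draft(bad, draft))   # True
```

The check is case-sensitive and only catches verbatim reuse; a paraphrased draft would slip through, so the structural guarantee (no draft parameter) is the more dependable of the two.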
FAQ
What is chain-of-verification?
Chain-of-verification (CoVe) is a prompting technique where the model first drafts an answer, then generates verification questions for each claim, answers them independently, and revises the draft accordingly.
When should I use chain-of-verification?
Use it for factual generation where wrong answers are costly, for long-form content with multiple claims, and for background tasks where a 2-4× token cost is acceptable.
What are the most common mistakes with chain-of-verification?
The two most common mistakes are letting the verification step reuse the original draft, which leads the model to rationalise its first answer, and skipping the independent re-answer, which makes the verification performative.
Related terms
- Chain-of-thought prompting — Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, increasing accuracy on multi-step problems.
- Self-consistency — Self-consistency runs the same prompt multiple times at non-zero temperature and picks the most common final answer, raising accuracy on reasoning tasks.
- Hallucination — A hallucination is when a language model produces output that is factually wrong, fabricated, or unsupported, while sounding confident.
- Grounding — Grounding is any technique that ties a language model's output to verifiable sources — retrieved documents, tool results, structured data — instead of pure memory.
Sources
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/chain-of-verification.md.