Hallucination
A hallucination is when a language model produces output that is factually wrong, fabricated, or unsupported, while sounding confident.
Hallucination is the most discussed failure mode of large language models (LLMs). It happens because a language model predicts plausible next tokens rather than verified facts: when the relevant knowledge is missing from its training data or contradictory within it, the model fills the gap with content that is statistically likely but false. Common forms include invented citations, fake API signatures, wrong historical dates, and confident answers about events after the model's knowledge cutoff. Mitigations include retrieval-augmented generation (RAG), explicit uncertainty prompting (for example, "If unsure, say 'I don't know'."), verifying output in a second pass, and lowering the temperature for factual queries.
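Two of the mitigations above, grounding the prompt in retrieved passages and running a second verification pass, can be sketched in a few lines. This is a minimal illustration and not a production method: the function names, the prompt wording, the 4-character word filter, and the 0.5 overlap threshold are all assumptions made for the example, and the word-overlap check is a crude stand-in for a real verifier such as a second model call.

```python
import re

ABSTAIN = "I don't know"  # assumed abstention phrase, taken from the prompt example above


def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that restricts the model to the supplied passages."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered passages below.\n"
        f"If the passages do not contain the answer, say '{ABSTAIN}'.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def _content_words(text: str) -> set[str]:
    # Lowercased alphanumeric runs of 4+ characters; a rough proxy for "content" words.
    return set(re.findall(r"[a-z0-9]{4,}", text.lower()))


def unsupported_sentences(
    answer: str, passages: list[str], min_overlap: float = 0.5
) -> list[str]:
    """Second-pass check: flag sentences whose content words mostly do not appear in the passages."""
    source_words = _content_words(" ".join(passages))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = _content_words(sentence)
        if not words:
            continue  # nothing to check in this sentence
        overlap = len(words & source_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged


# Illustrative data only; the answer string stands in for a model response.
passages = ["The Eiffel Tower was completed in 1889 for the World's Fair in Paris."]
prompt = build_grounded_prompt("When was the Eiffel Tower completed?", passages)
answer = "The Eiffel Tower was completed in 1889. It was designed by Leonardo da Vinci."
print(unsupported_sentences(answer, passages))
```

Here the second sentence of the sample answer is flagged because none of its content words occur in the passage. A lexical check like this misses paraphrases and cannot catch a false claim built from words that do appear in the source, so it is best read as a cheap first filter.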
Common mistakes
- Assuming bigger models always hallucinate less. They can still fabricate, and often do so more fluently and confidently.
- Treating hallucination as a prompt-only problem when retrieval is often the real fix.
- Not running factual evals before shipping.
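A factual eval of the kind mentioned in the last item can start very small. The sketch below assumes the model is exposed as a plain callable from question to reply; `run_factual_eval`, `fake_model`, the substring-match scoring, and the sample questions are hypothetical and chosen only to keep the example self-contained.

```python
from typing import Callable

ABSTAIN = "i don't know"  # assumed abstention phrase, compared in lowercase


def run_factual_eval(
    model: Callable[[str], str], cases: list[tuple[str, str]]
) -> dict[str, float]:
    """Score a model on (question, expected substring) pairs."""
    if not cases:
        raise ValueError("cases must not be empty")
    correct = abstained = wrong = 0
    for question, expected in cases:
        reply = model(question).lower()
        if expected.lower() in reply:
            correct += 1
        elif ABSTAIN in reply:
            abstained += 1
        else:
            wrong += 1  # answered without the expected fact: a likely hallucination
    total = len(cases)
    return {
        "accuracy": correct / total,
        "abstention_rate": abstained / total,
        "wrong_rate": wrong / total,
    }


def fake_model(question: str) -> str:
    """Hypothetical stand-in for a real model call, with canned replies."""
    canned = {
        "What is the chemical symbol for gold?": "The symbol is Au.",
        "How many continents are there?": "There are seven continents.",
        "What is the capital of Australia?": "The capital is Sydney.",
    }
    return canned.get(question, "I don't know.")


cases = [
    ("What is the chemical symbol for gold?", "Au"),
    ("How many continents are there?", "seven"),
    ("What is the capital of Australia?", "Canberra"),
    ("What is the boiling point of water at sea level in Celsius?", "100"),
]
report = run_factual_eval(fake_model, cases)
print(report)
```

Tracking the wrong-answer rate separately from the abstention rate matters here: an honest "I don't know" is a different outcome from a confident fabrication, and a single accuracy number hides the difference. Substring matching is brittle, so a real eval would likely need normalized or graded answers.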
Related terms
- Retrieval-augmented generation (RAG) — Retrieval-augmented generation (RAG) injects relevant documents into the prompt at query time so the model answers from your data instead of its training memory.
- Grounding — Grounding is any technique that ties a language model's output to verifiable sources — retrieved documents, tool results, structured data — instead of pure memory.
- Temperature — Temperature is a sampling parameter that controls randomness in a language model's output, where 0 is effectively deterministic (the most likely token is always chosen) and higher values introduce more variety.
- Chain-of-thought prompting — Chain-of-thought (CoT) prompting tells a language model to write its reasoning steps before its final answer, which often improves accuracy on multi-step problems.
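To make the temperature entry in the list above concrete, the sketch below shows the usual way temperature is applied: logits are divided by the temperature before the softmax. The function name and the example logits are illustrative, and treating a temperature of 0 as greedy selection is an assumption of this sketch; actual inference APIs may handle that case differently.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn logits into token probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumption of this sketch: treat 0 as greedy decoding, all mass on the top logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # illustrative scores for three candidate tokens
for t in (0.0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher ones spread it across alternatives. Lowering the temperature reduces sampling variety, but it does not supply knowledge the model lacks.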
Sources
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/hallucination.md.