technique

Instruction tuning

Instruction tuning is the post-training stage in which a base language model is fine-tuned on (instruction, ideal response) pairs so that it follows human instructions reliably.

Base LLMs (raw next-token predictors) do not follow instructions well; they tend to continue the input text in the same style rather than act on it. Instruction tuning (popularised by FLAN in 2021 and InstructGPT in 2022) reshapes the model to carry out imperative inputs like "Translate this to French". The core step is supervised fine-tuning on instruction-response pairs. RLHF (reinforcement learning from human feedback) and DPO (direct preference optimisation) are the dominant preference-tuning techniques and are typically applied after that supervised step. In 2026 the discipline has matured: most open-weight model releases ship a base + instruction-tuned pair, and serious fine-tuning teams pick up where instruction tuning left off and add domain-specific or persona-specific alignment.
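
To make the supervised step concrete, here is a minimal sketch of how one (instruction, ideal response) pair might be turned into a training example. The prompt template, the `toy_tokenize` helper, and the `<eos>` marker are hypothetical stand-ins chosen for illustration; a real pipeline would use the model's own tokenizer and chat format. The one real convention relied on is that PyTorch's `CrossEntropyLoss` ignores targets equal to `-100` by default, which is a common way to compute loss on the response tokens only.

```python
from dataclasses import dataclass

# PyTorch's CrossEntropyLoss skips targets with this value by default (ignore_index=-100).
IGNORE_INDEX = -100


@dataclass
class Example:
    input_ids: list[int]
    labels: list[int]


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Hypothetical whitespace tokenizer that assigns ids on first sight."""
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]


def build_example(instruction: str, response: str, vocab: dict[str, int]) -> Example:
    # Hypothetical template; each real model family defines its own format.
    prompt = f"### Instruction: {instruction} ### Response:"
    prompt_ids = toy_tokenize(prompt, vocab)
    response_ids = toy_tokenize(response + " <eos>", vocab)
    # Mask the prompt positions so only the response contributes to the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return Example(input_ids=prompt_ids + response_ids, labels=labels)


vocab: dict[str, int] = {}
ex = build_example("Translate this to French: good morning", "bonjour", vocab)
print(ex.input_ids)  # prompt ids followed by response ids
print(ex.labels)     # -100 over the prompt, response ids over the response
```

Masking the prompt is a design choice rather than a requirement: some recipes train on every token. The sketch also assumes the one-position shift needed for next-token prediction is handled by the training code that consumes `input_ids` and `labels`.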

Common mistakes

Trying to teach a base model new tasks through prompting alone: without instruction tuning it will not follow the prompt consistently. Continuing to fine-tune an already instruction-tuned model on raw text: this degrades its instruction following.

FAQ

What is instruction tuning?

Instruction tuning is the post-training stage in which a base language model is fine-tuned on (instruction, ideal response) pairs so that it follows human instructions reliably.

What are the most common mistakes with instruction tuning?

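Two mistakes come up most often. The first is trying to teach a base model new tasks through prompting alone: without instruction tuning it will not follow the prompt consistently. The second is continuing to fine-tune an already instruction-tuned model on raw text, which degrades its instruction following.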

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/instruction-tuning.md.