Prompt tuning
Prompt tuning trains a small set of "soft prompt" tokens (continuous vectors prepended to the model input) to specialise a frozen LLM for a task with minimal trainable parameters.
Introduced by Lester et al. (2021), prompt tuning learns task-specific embedding vectors (not text) that are prepended to the model's input embeddings. Only the soft prompt parameters are trained; the rest of the model stays frozen. Related methods include prefix tuning, which learns prefix vectors for every layer rather than only the input, and P-tuning v2, which likewise applies soft prompts at every layer. For narrow tasks with enough labelled data, prompt tuning can approach full fine-tuning quality with a tiny fraction of the trainable parameters. As of 2026 it is less popular than LoRA for most use cases because LoRA is more flexible, but prompt tuning remains a strong option for truly parameter-constrained deployments.
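The prepending step can be illustrated with a minimal sketch in plain Python. It is not the implementation from Lester et al.; the sizes, the random initialisation, and the helper names `embed` and `prepend_soft_prompt` are assumptions made for illustration, and the embedding table stands in for the whole frozen model.

```python
import random

def embed(token_ids, embedding_table):
    """Look up frozen token embeddings, one vector per token id."""
    return [embedding_table[t] for t in token_ids]

def prepend_soft_prompt(soft_prompt, input_embeddings):
    """Place the trainable soft-prompt vectors in front of the input embeddings."""
    return soft_prompt + input_embeddings

random.seed(0)
VOCAB_SIZE, D_MODEL, PROMPT_LEN = 100, 8, 4  # illustrative sizes only

# Frozen: pretrained embedding table (hypothetical stand-in for the frozen model).
embedding_table = [
    [random.gauss(0.0, 1.0) for _ in range(D_MODEL)] for _ in range(VOCAB_SIZE)
]

# Trainable: PROMPT_LEN vectors of width D_MODEL. In real training, these are
# the only values the optimiser would update.
soft_prompt = [
    [random.gauss(0.0, 0.02) for _ in range(D_MODEL)] for _ in range(PROMPT_LEN)
]

token_ids = [5, 17, 42]  # a tokenised input, ids chosen arbitrarily
sequence = prepend_soft_prompt(soft_prompt, embed(token_ids, embedding_table))

print(len(sequence), len(sequence[0]))  # 7 8
print("trainable values:", PROMPT_LEN * D_MODEL)  # trainable values: 32
```

The model then processes the 7-position sequence as if the first 4 positions were ordinary token embeddings, even though they correspond to no vocabulary entry.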
When to use prompt tuning
- Highly parameter-constrained deployments.
- Narrow tasks with sufficient labelled data.
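To make "parameter-constrained" concrete, the rough arithmetic below compares trainable parameter counts. It is a sketch under stated assumptions: the hidden size, prompt length, LoRA rank, and the choice of adapting two square projection matrices in each of 32 layers are hypothetical values, not figures from any specific model.

```python
def prompt_tuning_params(prompt_len: int, d_model: int) -> int:
    """One trainable vector of width d_model per soft-prompt position."""
    return prompt_len * d_model

def lora_params(rank: int, d_in: int, d_out: int, num_matrices: int) -> int:
    """Each adapted d_in x d_out weight gets two low-rank factors:
    rank * d_in plus rank * d_out trainable values."""
    return rank * (d_in + d_out) * num_matrices

# Hypothetical configuration, for illustration only.
D_MODEL = 4096
soft = prompt_tuning_params(prompt_len=20, d_model=D_MODEL)
lora = lora_params(rank=8, d_in=D_MODEL, d_out=D_MODEL, num_matrices=2 * 32)

print(soft)          # 81920
print(lora)          # 4194304
print(lora // soft)  # 51
```

Under these assumptions the soft prompt is about 51 times smaller than the LoRA adapter, which matters when many per-task artefacts must be stored or swapped on constrained hardware.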
Common mistakes
- Trying to learn complex multi-task behaviour with a small soft prompt — usually under-fits.
- Mixing soft prompts and few-shot examples without accounting for the context-window tokens that both consume.
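The token budget point can be checked with simple arithmetic, since soft-prompt vectors occupy sequence positions just as text tokens do. The sketch below uses a hypothetical helper, `remaining_input_budget`, and made-up sizes; real numbers depend on the model and tokenizer.

```python
def remaining_input_budget(
    context_window: int,
    soft_prompt_len: int,
    few_shot_tokens: int,
    reserved_output_tokens: int,
) -> int:
    """Return how many tokens are left for the actual task input.

    Soft-prompt positions, few-shot examples, and the space reserved for
    generation all draw from the same context window.
    """
    used = soft_prompt_len + few_shot_tokens + reserved_output_tokens
    remaining = context_window - used
    if remaining < 0:
        raise ValueError(f"over budget by {-remaining} tokens")
    return remaining

# Illustrative numbers: 5 few-shot examples of roughly 150 tokens each.
budget = remaining_input_budget(
    context_window=2048,
    soft_prompt_len=100,
    few_shot_tokens=5 * 150,
    reserved_output_tokens=256,
)
print(budget)  # 942
```

With these assumed sizes, less than half of the window is left for the input itself, which is why the two techniques should be combined deliberately.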
FAQ
What is prompt tuning?
Prompt tuning trains a small set of "soft prompt" tokens (continuous vectors prepended to the model input) to specialise a frozen LLM for a task with minimal trainable parameters.
When should I use prompt tuning?
Use it for highly parameter-constrained deployments and for narrow tasks with sufficient labelled data.
What are the most common mistakes with prompt tuning?
The most common mistakes are trying to learn complex multi-task behaviour with a small soft prompt, which usually under-fits, and mixing soft prompts with few-shot examples without accounting for the token budget.
Related terms
- Fine-tuning — Fine-tuning updates a pretrained model's weights on task-specific data, baking the new behaviour into the model rather than relying on prompts.
- LoRA (Low-Rank Adaptation) — LoRA is a fine-tuning method that trains a small set of low-rank adapter weights on top of a frozen base model — cheaper to train and store than full fine-tuning.
- Few-shot prompting — Few-shot prompting supplies 2–10 input–output examples inside the prompt so the model imitates the pattern on a new input.
Sources
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/prompt-tuning.md.