
Prompt tuning

Prompt tuning trains a small set of "soft prompt" tokens (continuous vectors prepended to the model's input embeddings) to specialise a frozen LLM for a task while training very few parameters.

Introduced by Lester et al. (2021), prompt tuning learns task-specific embedding vectors (not text) that are prepended to the model's input. Only the soft prompt parameters are trained; the rest of the model stays frozen. Related methods include prefix tuning (trainable prefix vectors at every layer) and P-tuning v2 (soft prompts applied at every layer rather than only at the input). For narrow tasks with enough labelled data, prompt tuning can approach full fine-tuning quality with a tiny fraction of the trainable parameters. As of 2026 it is less popular than LoRA for most use cases, because LoRA is more flexible, but prompt tuning still wins on truly parameter-constrained deployments.
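To make the mechanics concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not the setup from Lester et al.: the "frozen model" is a hypothetical stand-in (a fixed embedding table, mean pooling, tanh, and a fixed readout vector), and names such as `soft_prompt`, `forward` and `toy_data` are invented for illustration. It shows the two defining properties: the soft prompt vectors are prepended to the input embeddings, and they are the only values gradient descent updates.

```python
import math
import random

random.seed(0)
D_MODEL, VOCAB, N_PROMPT = 4, 6, 2

# Frozen toy "model" (an assumption for illustration): never updated below.
embedding = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(VOCAB)]
readout = [random.uniform(-1, 1) for _ in range(D_MODEL)]

# The only trainable parameters: N_PROMPT continuous vectors of size D_MODEL.
soft_prompt = [[0.0] * D_MODEL for _ in range(N_PROMPT)]


def forward(token_ids, prompt):
    """Prepend the prompt vectors to the token embeddings, mean-pool, tanh, read out."""
    seq = prompt + [embedding[t] for t in token_ids]
    hidden = [math.tanh(sum(vec[j] for vec in seq) / len(seq)) for j in range(D_MODEL)]
    logit = sum(w * h for w, h in zip(readout, hidden))
    return logit, hidden, len(seq)


def loss_and_grad(token_ids, label, prompt):
    """Binary cross-entropy and its gradient with respect to one soft prompt vector."""
    logit, hidden, n = forward(token_ids, prompt)
    prob = 1.0 / (1.0 + math.exp(-logit))
    loss = -math.log(prob if label == 1 else 1.0 - prob)
    # Chain rule through sigmoid, readout, tanh and the mean over n positions.
    # Under mean pooling every prompt vector receives this same gradient.
    grad = [(prob - label) * w * (1.0 - h * h) / n for w, h in zip(readout, hidden)]
    return loss, grad


def train(data, steps=200, lr=1.0):
    """Full-batch gradient descent that updates the soft prompt only."""
    history = []
    for _ in range(steps):
        total_loss, total_grad = 0.0, [0.0] * D_MODEL
        for token_ids, label in data:
            loss, grad = loss_and_grad(token_ids, label, soft_prompt)
            total_loss += loss / len(data)
            total_grad = [g + gi / len(data) for g, gi in zip(total_grad, grad)]
        history.append(total_loss)
        for vec in soft_prompt:
            for j in range(D_MODEL):
                vec[j] -= lr * total_grad[j]
    return history


# Hypothetical labelled data: token-id sequences with binary labels.
toy_data = [([0, 1, 2], 1), ([0, 2, 2], 1), ([3, 4, 5], 0), ([3, 5, 5], 0)]
frozen_snapshot = [row[:] for row in embedding]
losses = train(toy_data)
print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
print("trainable parameters:", N_PROMPT * D_MODEL)
```

In a real setup the frozen part would be a full transformer and the gradients would come from an autodiff framework, but the division of labour is the same: the model's weights stay untouched and one small matrix of prompt vectors is stored per task.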

When to use prompt tuning

Common mistakes

FAQ

What is prompt tuning?

Prompt tuning trains a small set of "soft prompt" tokens (continuous vectors prepended to the model's input embeddings) to specialise a frozen LLM for a task while training very few parameters.

When should I use prompt tuning?

Use it for highly parameter-constrained deployments and for narrow tasks with sufficient labelled data.
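A rough back-of-the-envelope comparison shows what "parameter-constrained" means in practice. The sketch below uses the standard counting formulas: one vector of width `d_model` per soft token for prompt tuning, and two low-rank factors per adapted matrix for LoRA. The configuration (hidden size 4096, 20 soft tokens, rank 8, two adapted projections in each of 32 layers) is hypothetical and chosen only for illustration.

```python
def soft_prompt_params(num_prompt_tokens: int, d_model: int) -> int:
    """Trainable parameters for input-level prompt tuning: one d_model vector per soft token."""
    return num_prompt_tokens * d_model


def lora_params(rank: int, d_in: int, d_out: int, num_matrices: int) -> int:
    """Trainable parameters for LoRA: each adapted matrix gets two low-rank factors."""
    return num_matrices * rank * (d_in + d_out)


# Hypothetical configuration, not taken from any specific model.
D_MODEL = 4096
prompt_count = soft_prompt_params(num_prompt_tokens=20, d_model=D_MODEL)
lora_count = lora_params(rank=8, d_in=D_MODEL, d_out=D_MODEL, num_matrices=2 * 32)

print(prompt_count)               # 81920
print(lora_count)                 # 4194304
print(lora_count / prompt_count)  # 51.2
```

Under these assumptions the soft prompt is about fifty times smaller than the LoRA adapter. Both are tiny next to the frozen model, so the difference matters mainly when many per-task artefacts must be stored or shipped.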

What are the most common mistakes with prompt tuning?

Trying to learn complex multi-task behaviour with a small soft prompt, which usually under-fits. Mixing soft prompts with few-shot examples without accounting for their combined impact on the token budget.
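The token budget point can be checked with simple arithmetic: soft prompt vectors occupy sequence positions just like ordinary token embeddings, so they compete with few-shot examples for the same context window. The helper below is a hypothetical sketch (the function name and all numbers are assumptions) of how one might check the remaining room for the actual task input.

```python
def remaining_input_tokens(
    context_window: int,
    num_soft_tokens: int,
    few_shot_tokens: int,
    max_new_tokens: int,
) -> int:
    """Positions left for the task input after soft prompt, few-shot examples and output."""
    room = context_window - num_soft_tokens - few_shot_tokens - max_new_tokens
    if room < 0:
        raise ValueError("soft prompt, few-shot examples and output exceed the context window")
    return room


# Hypothetical budget: 2048-token window, 100 soft tokens,
# 1200 tokens of few-shot examples, 256 tokens reserved for generation.
print(remaining_input_tokens(2048, 100, 1200, 256))  # 492
```

If the remaining room is too small for typical inputs, trimming the few-shot examples is usually the first option to consider, since a trained soft prompt cannot be shortened without retraining.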

Sources

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/prompt-tuning.md.