LoRA (Low-Rank Adaptation)
LoRA is a fine-tuning method that trains a small set of low-rank adapter weights on top of a frozen base model — cheaper to train and store than full fine-tuning.
LoRA (Hu et al., 2021) inserts trainable rank-decomposition matrices into transformer layers while keeping the original weights frozen. Because only the adapter matrices receive gradients and optimizer state, training needs far less memory than full fine-tuning, and you store a small adapter (typically tens to hundreds of MB for a 70B-parameter model, depending on rank and which layers are adapted) instead of a full checkpoint (about 140 GB in 16-bit precision). LoRA adapters can be hot-swapped at inference time, so one base model can serve many specialised tasks. QLoRA (Dettmers et al., 2023) adds 4-bit quantisation of the frozen base model; the QLoRA paper reports fine-tuning a 65B-parameter model on a single 48 GB GPU. LoRA is the default fine-tuning technique in 2026 for open-weight LLMs and image diffusion models.
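The mechanism can be sketched in a few lines of plain Python. This is a minimal illustration under the usual LoRA notation (frozen weight `W`, trainable `A` and `B`, scaling `alpha / r`), not any library's implementation; the helper names `matvec`, `lora_forward`, and `lora_params`, and the 4096-wide layer used for the parameter count, are assumptions chosen for the example.

```python
# Framework-free sketch of one LoRA-adapted linear layer.
# All helper names and shapes here are hypothetical, for illustration only.

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x). W is frozen; only A and B train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))  # project down to rank r, then back up
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def lora_params(d_in, d_out, r):
    """Trainable parameters for one adapted matrix: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]          # frozen base weight, 2 x 3
A = [[1.0, 1.0, 1.0]]          # rank-1 down-projection, 1 x 3
B_init = [[0.0], [0.0]]        # B starts at zero, so the adapter is a no-op at first
B_trained = [[0.5], [-0.5]]    # made-up values standing in for a trained B
x = [1.0, 2.0, 3.0]

y_before = lora_forward(W, A, B_init, x, alpha=2, r=1)     # [1.0, 2.0]
y_after = lora_forward(W, A, B_trained, x, alpha=2, r=1)   # [7.0, -4.0]

# Parameter count for a hypothetical 4096 x 4096 projection at rank 8.
full = 4096 * 4096                       # 16,777,216 weights in the frozen matrix
adapter = lora_params(4096, 4096, 8)     # 65,536 trainable weights
print(y_before, y_after, full // adapter)
```

With these assumed shapes the adapter holds 256 times fewer parameters than the matrix it adapts, which is where the storage and optimizer-memory savings come from.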
When to use LoRA (Low-Rank Adaptation)
- Customising open-weight models on small datasets (500–10,000 examples).
- Training a character or art style on Stable Diffusion / Flux.
- Multi-tenant deployments where many adapters share one base.
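The multi-tenant case can be illustrated with a toy router that keeps one frozen base matrix and picks an adapter per request. This is a sketch only: `AdapterRouter`, the tenant names, and the tiny matrices are all hypothetical, and a real serving stack would batch requests and keep the weights on the accelerator.

```python
# Toy illustration of many adapters sharing one frozen base weight.
# Class, method, and tenant names are hypothetical.

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

class AdapterRouter:
    def __init__(self, base_weight):
        self.base_weight = base_weight   # shared and never modified
        self.adapters = {}               # name -> (A, B, scale)

    def register(self, name, A, B, scale):
        """Store a small adapter; the base weight is not copied."""
        self.adapters[name] = (A, B, scale)

    def forward(self, x, adapter=None):
        """Run the base layer, then add the chosen adapter's low-rank update."""
        y = matvec(self.base_weight, x)
        if adapter is None:
            return y
        A, B, scale = self.adapters[adapter]
        delta = matvec(B, matvec(A, x))
        return [yi + scale * di for yi, di in zip(y, delta)]

router = AdapterRouter([[1.0, 0.0],
                        [0.0, 1.0]])
router.register("tenant_a", A=[[1.0, 1.0]], B=[[1.0], [0.0]], scale=1.0)
router.register("tenant_b", A=[[1.0, -1.0]], B=[[0.0], [2.0]], scale=0.5)

x = [3.0, 1.0]
base_out = router.forward(x)               # [3.0, 1.0]
out_a = router.forward(x, "tenant_a")      # [7.0, 1.0]
out_b = router.forward(x, "tenant_b")      # [3.0, 3.0]
print(base_out, out_a, out_b)
```

Switching tenants only changes which small `(A, B)` pair is looked up, so adding a tenant costs adapter-sized memory rather than another copy of the base model.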
Common mistakes
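- Setting the LoRA rank too low (under 4), which under-fits on complex tasks.
- Forgetting to merge the LoRA weights into the base weights on latency-critical production paths; an unmerged adapter adds extra matrix multiplications to every adapted layer.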
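Merging amounts to folding the low-rank update into the base matrix once, so inference runs a single matrix multiplication per layer. The sketch below assumes the usual LoRA formulation, `W_merged = W + scale * B A`; the helper names and the small matrices are hypothetical and not taken from any specific library.

```python
# Sketch of merging a LoRA adapter into its base weight.
# Helper names and values are hypothetical.

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def matmul(X, Y):
    """Multiply matrix X (n x k) by matrix Y (k x m)."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def merge_lora(W, A, B, scale):
    """Return W + scale * (B A) as a new matrix; W itself is left untouched."""
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 1.0, 1.0]]
B = [[0.5], [-0.5]]
scale = 2.0
x = [1.0, 2.0, 3.0]

# Unmerged path: base matmul plus two extra adapter matmuls per call.
unmerged = [b + scale * u
            for b, u in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: one matmul per call, same result.
W_merged = merge_lora(W, A, B, scale)
merged = matvec(W_merged, x)
print(W_merged, unmerged, merged)
```

The trade-off is that a merged weight is tied to one adapter, so merging suits single-task, latency-critical paths, while deployments that swap adapters per request would keep them unmerged.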
Related terms
- Fine-tuning — Fine-tuning updates a pretrained model's weights on task-specific data, baking the new behaviour into the model rather than relying on prompts.
- Diffusion model — A diffusion model is a generative neural network that creates images, video, or audio by iteratively denoising random noise toward a learned target distribution.
- Embeddings — Embeddings are dense numeric vectors that represent the meaning of text, images, or other data, allowing similarity to be measured as vector distance.
Sources
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/lora.md.