Cheap-tier model
A cheap-tier model is the small, fast LLM that each major provider ships alongside its frontier model (for example Claude Haiku, GPT-4o-mini, Gemini Flash, Mistral Small, or DeepSeek V3), used for routing, classification, extraction, and bulk inference.
Every major LLM provider in 2026 maintains a cheap-tier model priced 5-20× below its frontier tier. Typical use cases are a router LLM that picks the next tool or model, classification and extraction at scale, real-time voice agents where latency matters, and bulk content moderation. Quality is materially below frontier on hard reasoning but adequate for narrow tasks. The cost effect is large: a production stack that routes 80% of traffic to the cheap tier and 20% to frontier can cut total cost roughly 3-4× with minimal quality loss, provided the routing is evaluated and tuned. At that split the frontier share alone caps the saving at 5×, assuming similar token volume per request on both tiers.
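The blended saving from a traffic split can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `blended_cost_reduction` is hypothetical, frontier cost is normalised to 1.0 per request, and it assumes both tiers see the same token volume per request, which real workloads may not.

```python
def blended_cost_reduction(cheap_share: float, price_ratio: float) -> float:
    """Return how many times cheaper a mixed stack is than an all-frontier stack.

    cheap_share: fraction of traffic sent to the cheap tier (0..1).
    price_ratio: how many times cheaper the cheap tier is per request.
    Assumption: equal token volume per request on both tiers.
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Frontier cost is normalised to 1.0 per request.
    blended = (1.0 - cheap_share) * 1.0 + cheap_share / price_ratio
    return 1.0 / blended


if __name__ == "__main__":
    # 80% of traffic on the cheap tier, at the 5x, 10x and 20x price ratios.
    for ratio in (5, 10, 20):
        saving = blended_cost_reduction(0.8, ratio)
        print(f"{ratio}x cheaper tier -> {saving:.2f}x lower total cost")
```

Under these assumptions an 80/20 split yields about 2.8× at a 5× price ratio and about 4.2× at 20×. The remaining 20% of frontier traffic dominates the bill, so larger savings require moving more traffic to the cheap tier, not just a cheaper cheap tier.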
When to use a cheap-tier model
- Router LLMs.
- Classification + extraction at scale.
- Real-time voice + latency-critical apps.
Common mistakes
- Routing too much to cheap tier without evals — quality drift goes unnoticed.
- Using cheap tier for hard reasoning — frontier still wins by a wide margin.
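One way to keep quality drift visible is to score the cheap tier periodically against reference labels (for example, frontier outputs that a human has spot-checked) on a fixed sample. The following is a minimal sketch, not a full eval harness: `agreement_rate`, `should_alert`, the 0.95 floor, and the stand-in `fake_cheap_model` are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, Sequence


def agreement_rate(
    prompts: Sequence[str],
    reference_labels: Sequence[str],
    cheap_model: Callable[[str], str],
) -> float:
    """Fraction of prompts where the cheap tier matches the reference label."""
    if len(prompts) != len(reference_labels):
        raise ValueError("prompts and reference_labels must be the same length")
    if not prompts:
        raise ValueError("need at least one evaluation example")
    # Compare case-insensitively and ignore surrounding whitespace.
    matches = sum(
        cheap_model(prompt).strip().lower() == ref.strip().lower()
        for prompt, ref in zip(prompts, reference_labels)
    )
    return matches / len(prompts)


def should_alert(rate: float, min_rate: float = 0.95) -> bool:
    """True when agreement has dropped below the accepted floor (assumed 0.95)."""
    return rate < min_rate


def fake_cheap_model(prompt: str) -> str:
    """Stand-in for a real cheap-tier call; swap in your provider's client."""
    return "refund" if "money back" in prompt else "other"


if __name__ == "__main__":
    prompts = ["I want my money back", "Where is my order?", "cancel my plan"]
    refs = ["refund", "other", "cancel"]
    rate = agreement_rate(prompts, refs, fake_cheap_model)
    print(f"agreement={rate:.2f} alert={should_alert(rate)}")
```

Exact-match agreement only suits classification-style outputs. Free-text tasks would need a different scorer, but the same pattern applies: a fixed sample, a reference, and a threshold that triggers review.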
FAQ
What is a cheap-tier model?
A cheap-tier model is the small, fast LLM that each major provider ships alongside its frontier model (for example Claude Haiku, GPT-4o-mini, Gemini Flash, Mistral Small, or DeepSeek V3), used for routing, classification, extraction, and bulk inference.
When should I use a cheap-tier model?
Use one for router LLMs, for classification and extraction at scale, and for real-time voice and other latency-critical apps.
What are the most common mistakes with a cheap-tier model?
Routing too much to cheap tier without evals — quality drift goes unnoticed. Using cheap tier for hard reasoning — frontier still wins by a wide margin.
Related terms
- Model router — A model router picks which language model handles each request based on cost, latency, or task type — the standard production pattern in 2026.
- Router LLM — A router LLM is a small fast language model whose only job is to classify or rewrite an incoming request — deciding which downstream model, agent, or tool should handle it.
- Model router policy — A model router policy is the rule set that decides which model handles each request — usually as a chain of conditions (intent, latency budget, cost ceiling, quality required) over the available model set.
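A router policy expressed as a chain of conditions might look like the sketch below. Everything in it is an assumption made for illustration: the `Request` fields, the intent names, the tier labels, and the 500 ms latency threshold are hypothetical, and the cost-ceiling condition is omitted for brevity.

```python
from dataclasses import dataclass

CHEAP = "cheap-tier"
FRONTIER = "frontier"

# Narrow tasks the cheap tier is assumed to handle adequately.
NARROW_INTENTS = {"route", "classify", "extract", "moderate"}


@dataclass(frozen=True)
class Request:
    intent: str               # e.g. "classify", "extract", "reason"
    latency_budget_ms: int    # how long the caller is willing to wait
    needs_high_quality: bool  # caller-declared quality requirement


def choose_tier(req: Request) -> str:
    """Evaluate the conditions in order; the first match wins."""
    # 1. Quality requirements and hard reasoning go to frontier.
    if req.needs_high_quality or req.intent == "reason":
        return FRONTIER
    # 2. Tight latency budgets favour the faster cheap tier.
    if req.latency_budget_ms < 500:
        return CHEAP
    # 3. Known narrow tasks go to the cheap tier.
    if req.intent in NARROW_INTENTS:
        return CHEAP
    # 4. Anything unrecognised defaults to frontier as the safer choice.
    return FRONTIER


if __name__ == "__main__":
    print(choose_tier(Request("classify", 2000, False)))
    print(choose_tier(Request("reason", 200, False)))
```

Putting the quality check first means a latency budget can never push a hard-reasoning request onto the cheap tier. Whether that ordering is right depends on the product, so it is worth covering with evals rather than assuming.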
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/cheap-tier-model.md.