
Cheap-tier model

A cheap-tier model is the small, fast LLM that each major provider ships alongside its frontier model (Claude Haiku, GPT-4o-mini, Gemini Flash, Mistral Small, DeepSeek V3), used for routing, classification, extraction, and bulk inference.

Every major LLM provider in 2026 maintains a cheap-tier model priced 5-20× below its frontier tier. Typical use cases are a router LLM that picks the next tool or model, classification and extraction at scale, real-time voice agents where latency matters, and bulk content moderation. Quality is materially below frontier on hard reasoning but adequate for narrow tasks. The cost effect is large: a production stack that routes 80% of traffic to the cheap tier and 20% to frontier can cut total cost roughly 3-4× with minimal quality loss, provided the routing is evaluated and tuned. The 20% still served by frontier caps the saving at 5×, so larger reductions require routing a larger share to the cheap tier.
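
The saving from a traffic split depends on both the share sent to the cheap tier and the price gap between tiers. The following is a minimal sketch of that arithmetic, assuming every request costs the same within a tier (real traffic rarely does, since hard requests tend to be longer). The function name `blended_cost_reduction` and its parameters are illustrative, not taken from any library.

```python
def blended_cost_reduction(cheap_share: float, price_ratio: float) -> float:
    """Return how many times cheaper a routed stack is than all-frontier.

    cheap_share: fraction of traffic sent to the cheap tier (0..1).
    price_ratio: how many times cheaper the cheap tier is per request.
    Assumption: every request costs the same on a given tier.
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Cost relative to an all-frontier stack, which is normalized to 1.0.
    relative_cost = (1.0 - cheap_share) + cheap_share / price_ratio
    return 1.0 / relative_cost


# The 80/20 split at both ends of the 5-20x price range.
for ratio in (5, 20):
    reduction = blended_cost_reduction(0.8, ratio)
    print(f"{ratio}x cheaper tier, 80/20 split: {reduction:.1f}x lower total cost")
```

Under this assumption the frontier share sets a ceiling: with 20% of traffic still on frontier, total cost cannot fall by more than 5× however cheap the other tier is.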

When to use a cheap-tier model

Common mistakes

FAQ

What is a cheap-tier model?

A cheap-tier model is the small, fast LLM that each major provider ships alongside its frontier model (Claude Haiku, GPT-4o-mini, Gemini Flash, Mistral Small, DeepSeek V3), used for routing, classification, extraction, and bulk inference.

When should I use a cheap-tier model?

Use one for router LLMs, for classification and extraction at scale, and for real-time voice and other latency-critical apps.
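
As an illustration of the router use case, here is a minimal sketch in which the cheap model labels each request and only clean "easy" labels stay on the cheap tier. The tier names, the prompt wording, and the `cheap_llm` callable are all hypothetical stand-ins; a real stack would pass in a function that calls its provider's client.

```python
from typing import Callable

# Hypothetical tier names; substitute whatever model IDs your provider uses.
CHEAP, FRONTIER = "cheap-tier", "frontier"

ROUTER_PROMPT = (
    "Label the request as EASY (classification, extraction, short factual "
    "answer) or HARD (multi-step reasoning). Reply with one word.\n\n"
    "Request: {request}"
)


def pick_tier(request: str, cheap_llm: Callable[[str], str]) -> str:
    """Ask the cheap model to label the request, then choose a tier.

    cheap_llm is any function that sends a prompt to the cheap-tier model
    and returns its text reply (a stand-in for a real client call).
    """
    reply = cheap_llm(ROUTER_PROMPT.format(request=request)).strip().upper()
    # Escalate on anything other than a clean EASY label, so a malformed
    # router reply fails toward quality rather than toward cost.
    return CHEAP if reply == "EASY" else FRONTIER


def fake_cheap_llm(prompt: str) -> str:
    """Toy stand-in for a model call: keys off one word in the request."""
    request = prompt.split("Request: ", 1)[1]
    return "HARD" if "prove" in request.lower() else "easy\n"


print(pick_tier("Extract the invoice date from this email.", fake_cheap_llm))
print(pick_tier("Prove that the algorithm terminates.", fake_cheap_llm))
```

Defaulting to frontier on an unparseable reply trades some cost for safety; a stack that prioritizes cost could invert that default, but should then measure how often the router reply is malformed.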

What are the most common mistakes with cheap-tier models?

Routing too much traffic to the cheap tier without evals, which lets quality drift go unnoticed. Using the cheap tier for hard reasoning, where frontier models still win by a wide margin.
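
One way to catch quality drift is to re-run a sample of cheap-tier traffic on the frontier model and track how often the two agree. The sketch below is one possible check, not a standard method: `agreement_rate`, the stub models, and the 0.9 threshold are hypothetical, and exact-match comparison only suits labels or extracted fields.

```python
import random
from typing import Callable, Sequence


def agreement_rate(
    requests: Sequence[str],
    cheap_llm: Callable[[str], str],
    frontier_llm: Callable[[str], str],
    sample_size: int,
    seed: int = 0,
) -> float:
    """Fraction of sampled requests where both tiers give the same answer.

    Exact string match after normalization suits labels and extracted
    fields; free-form text would need a task-specific comparison.
    """
    rng = random.Random(seed)
    sample = rng.sample(list(requests), min(sample_size, len(requests)))
    if not sample:
        raise ValueError("no requests to evaluate")
    matches = sum(
        cheap_llm(r).strip().lower() == frontier_llm(r).strip().lower()
        for r in sample
    )
    return matches / len(sample)


# Toy stand-ins for model calls: the cheap tier misses one phrasing.
def frontier_stub(text: str) -> str:
    return "refund" if "refund" in text or "money back" in text else "other"


def cheap_stub(text: str) -> str:
    return "refund" if "refund" in text else "other"


tickets = [
    "refund please",
    "I want my money back",
    "reset my password",
    "where is my refund",
]
rate = agreement_rate(tickets, cheap_stub, frontier_stub, sample_size=4)
THRESHOLD = 0.9  # hypothetical; set from your own quality bar
status = "investigate routing" if rate < THRESHOLD else "ok"
print(f"agreement {rate:.2f} -> {status}")
```

Agreement with frontier is only a proxy for correctness, so a labeled eval set remains the stronger check where one exists.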

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/cheap-tier-model.md.