Semantic routing
Semantic routing classifies an incoming query by meaning — via embedding similarity to predefined route prototypes — and dispatches it to the right model, agent, or sub-system.
Semantic routing is the cheap, fast layer that decides what to do before the expensive layer runs. Embed the query, compare it to a set of pre-embedded route prototypes ("customer support", "sales", "code question"), pick the closest, and route accordingly. The routing model itself is usually a small embedder rather than an LLM. It is used widely in 2026 production stacks to cut LLM cost (for example, routing 70% of queries to a cheap model and only 30% to a frontier model) and to gate access (refusing out-of-scope queries before they reach a model). Libraries: semantic-router, Pinecone Route, Cohere Classify.
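The embed, compare, pick-the-closest loop can be sketched in a few lines. This is a minimal illustration, not the API of any library named above: `embed` is a toy bag-of-words stand-in for a real embedding model, and the route names and example utterances are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: a real router would call a small embedding model here.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical routes, each described by a few example utterances.
ROUTES = {
    "customer_support": ["my order has not arrived", "i want a refund for my order"],
    "sales": ["what does the enterprise plan cost", "can i get a pricing quote"],
    "code_question": ["how do i fix this python error", "why does my function return none"],
}

# Prototypes are embedded once, up front, not on every request.
PROTOTYPES = {name: [embed(u) for u in utterances] for name, utterances in ROUTES.items()}


def route(query: str) -> tuple[str, float]:
    """Return the closest route and its similarity score."""
    q = embed(query)
    # Score each route by its best-matching prototype.
    scores = {name: max(cosine(q, p) for p in protos) for name, protos in PROTOTYPES.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Calling `route("where is my refund")` selects `customer_support` with this toy data. Scoring a route by its best-matching prototype is one design choice; averaging the prototypes into a single centroid per route is a common alternative.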
When to use semantic routing
- Multi-skill chatbots and agents.
- Cost-sensitive production with diverse query types.
- Gate enforcement before expensive model calls.
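The cost and gating use cases above come down to a dispatch table keyed by route name. The sketch below assumes the route has already been chosen by an upstream router; the model tier labels, the `Target` type, and the `call_model` callback are all hypothetical placeholders rather than any real provider's API.

```python
from dataclasses import dataclass
from typing import Callable

REFUSAL = "Sorry, I can't help with that request."


@dataclass(frozen=True)
class Target:
    model: str            # hypothetical model tier label
    allowed: bool = True  # False means the route is gated (refused)


# Hypothetical mapping: most routes go to a cheap model, one to the frontier.
DISPATCH = {
    "customer_support": Target("small-model"),
    "sales": Target("small-model"),
    "code_question": Target("frontier-model"),
    "out_of_scope": Target("none", allowed=False),
}


def dispatch(route_name: str, query: str, call_model: Callable[[str, str], str]) -> str:
    """Send the query to the model tier for its route, or refuse.

    Unknown route names are treated as out of scope, so nothing reaches
    an expensive model by accident.
    """
    target = DISPATCH.get(route_name, DISPATCH["out_of_scope"])
    if not target.allowed:
        return REFUSAL  # gate: no model call is made
    return call_model(target.model, query)
```

Passing `call_model` in as a parameter keeps the gate testable without network access: a stub that records which tier was requested is enough to check that refused routes never trigger a model call.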
Common mistakes
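- Too-similar route prototypes: when prototypes for different routes embed close together, their similarity scores are nearly tied and the winning route is effectively arbitrary.
- No fallback / out-of-scope route: without a similarity threshold or catch-all route, ambiguous queries are silently sent to whichever route happens to score highest.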
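Both mistakes can be guarded against with small checks. The sketch below is one possible approach, not a prescribed method: `confusable_pairs` flags route prototypes that sit too close together, and `pick_route` falls back when the best score is low or nearly tied with the runner-up. The function names and the `limit`, `threshold`, and `margin` values are illustrative assumptions and would need tuning against real traffic.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def confusable_pairs(prototypes: dict[str, list[float]], limit: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of routes whose prototype vectors have similarity >= limit."""
    return [
        (a, b)
        for (a, va), (b, vb) in combinations(prototypes.items(), 2)
        if cosine(va, vb) >= limit
    ]


def pick_route(
    scores: dict[str, float],
    threshold: float = 0.5,   # assumed minimum similarity to accept a route
    margin: float = 0.05,     # assumed minimum lead over the runner-up
    fallback: str = "out_of_scope",
) -> str:
    """Pick the best-scoring route, or the fallback if the match is weak or ambiguous."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return fallback
    best_name, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best < threshold or best - runner_up < margin:
        return fallback
    return best_name
```

Running `confusable_pairs` once at startup, whenever routes change, surfaces collapsing prototypes before they affect traffic, while `pick_route` makes the out-of-scope decision explicit on every request.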
FAQ
What is semantic routing?
Semantic routing classifies an incoming query by meaning — via embedding similarity to predefined route prototypes — and dispatches it to the right model, agent, or sub-system.
When should I use semantic routing?
Use it for multi-skill chatbots and agents, for cost-sensitive production systems with diverse query types, and for gate enforcement before expensive model calls.
What are the most common mistakes with semantic routing?
The two most common are route prototypes that are too similar, so embedding distances collapse, and a missing fallback or out-of-scope route, so ambiguous queries get mis-routed silently.
Related terms
- Embeddings — Embeddings are dense numeric vectors that represent the meaning of text, images, or other data, allowing similarity to be measured as vector distance.
- Model router — A model router picks which language model handles each request based on cost, latency, or task type — the standard production pattern in 2026.
- Tool router — A tool router is a layer in an agent that decides which tool to call (or which sub-agent to delegate to) for a given step — distinct from a model router which picks the underlying LLM.
- Semantic search — Semantic search finds documents by meaning rather than keyword match, using embedding similarity in a vector space.
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/semantic-routing.md.