Model router policy
A model router policy is the rule set that decides which model handles each request — usually as a chain of conditions (intent, latency budget, cost ceiling, quality required) over the available model set.
Single-model production stacks are increasingly the exception. A model router policy formalises which model handles which request, for example "intent=code → Claude 4.6", "intent=chat AND short → GPT-4o-mini", "intent=research AND long-context → Gemini 2 Pro". Policies live as YAML / JSON config, as executable Python, or as compiled rules in a routing framework (OpenRouter, Portkey, Martian). In 2026, a mature policy is one that is evaluated on real traffic: sample queries, score the outcomes per model, and adjust the policy based on the results. Without an explicit policy, routing decisions live in tribal knowledge and drift over time.
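As an illustration of the "executable Python" form, the sketch below expresses a policy as an ordered chain of rules where the first match wins. Everything in it is hypothetical: the `Request` fields, the token thresholds, and the model names are placeholders chosen for the example, not identifiers from any real provider or routing framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    # Hypothetical request features; a real system would derive these upstream.
    intent: str
    prompt_tokens: int


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Request], bool]
    model: str


# Ordered rule chain. Model names and thresholds are placeholders (assumptions).
POLICY = [
    Rule("code", lambda r: r.intent == "code", "code-model"),
    Rule("short-chat",
         lambda r: r.intent == "chat" and r.prompt_tokens < 500,
         "small-chat-model"),
    Rule("long-research",
         lambda r: r.intent == "research" and r.prompt_tokens > 50_000,
         "long-context-model"),
]
DEFAULT_MODEL = "general-model"


def route(request: Request, policy=POLICY, default: str = DEFAULT_MODEL):
    """Return (model, rule_name) for the first rule that matches the request."""
    for rule in policy:
        if rule.matches(request):
            return rule.model, rule.name
    # No rule matched: fall back to the default model.
    return default, "default"
```

Returning the rule name alongside the model is a deliberate choice in this sketch: logging which rule fired makes it possible to score outcomes per rule later, rather than only per model.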
When to use model router policy
- Multi-skill production assistants.
- Cost-sensitive deployments using router LLMs.
Common mistakes
- Hard-coded routing rules without evals — quality drift goes unnoticed.
- Policy too granular — coordination overhead exceeds the cost saving.
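One way to make the quality drift from the first mistake visible is to score a sample of routed traffic and compare each rule's mean score against a stored baseline. The following is a minimal sketch under stated assumptions: scores are numbers in [0, 1] produced by some external grader, and the function names, the tuple layout, and the 0.05 tolerance are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(samples):
    """Group (rule_name, model, score) samples and return the mean score per (rule, model)."""
    buckets = defaultdict(list)
    for rule_name, model, score in samples:
        buckets[(rule_name, model)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def flag_regressions(summary, baseline, tolerance=0.05):
    """Return the (rule, model) keys whose mean score fell more than `tolerance` below baseline."""
    return sorted(
        key
        for key, score in summary.items()
        if key in baseline and baseline[key] - score > tolerance
    )


# Hypothetical sampled traffic: (rule that fired, model used, grader score).
samples = [
    ("code", "code-model", 1.0),
    ("code", "code-model", 0.5),
    ("short-chat", "small-chat-model", 0.75),
]
baseline = {
    ("code", "code-model"): 0.9,
    ("short-chat", "small-chat-model"): 0.75,
}

summary = summarize(samples)
regressions = flag_regressions(summary, baseline)
```

Any key that appears in `regressions` is a candidate for adjusting the policy, for example by pointing that rule at a different model or tightening its condition.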
Related terms
- Model router — A model router picks which language model handles each request based on cost, latency, or task type — the standard production pattern in 2026.
- OpenRouter — OpenRouter is a unified API that lets you call 200+ language models through one endpoint with one API key — the de-facto model-router infrastructure layer in 2026.
- Evals (LLM evaluations) — Evals are systematic tests that measure how well a language model or LLM-powered system performs on a defined task using a golden set of inputs and reference outputs.
- Router fallback — A router fallback is a chain of model providers that the application tries in order — failing over from primary to secondary to tertiary on 429s, 500s, or quality thresholds.
Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/model-router-policy.md.