# LLM gateway

**Source:** https://promtable.com/glossary/llm-gateway

> An LLM gateway is the proxy layer between your app and one or more LLM providers. It handles routing, fallback, caching, cost tracking, rate limiting, and observability. OpenRouter, LiteLLM, Portkey, Helicone, and Cloudflare AI Gateway are among the leading options in 2026.

---

Calling LLM APIs directly works for prototypes. Production apps quickly need more: model fallback when the primary model errors, request caching for repeated prompts, per-team cost caps, audit logs for compliance, rate limiting against runaway loops, and multi-vendor routing for cost or quality. An LLM gateway centralizes all of this: the app talks to one OpenAI-compatible endpoint, and the gateway handles the rest.

The trade-offs depend on the deployment model. Hosted gateways (OpenRouter, Portkey) are zero-ops but add latency and a pricing margin. Self-hosted gateways (LiteLLM) give full control and lower cost but carry an operational burden. SDK-only approaches (Vercel AI SDK) avoid the proxy but lose central governance. Most production AI stacks above small scale use an LLM gateway.
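To make the centralization idea concrete, here is a minimal in-process sketch of two of the concerns listed above: request caching and per-team cost caps. It is an illustration under stated assumptions, not how any particular gateway product is implemented. The `MiniGateway` class, the `BudgetExceeded` exception, and the `(model, prompt) -> (text, cost_usd)` provider signature are all hypothetical names chosen for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Assumption: a provider is a plain function (model, prompt) -> (text, cost_usd).
# A real gateway would call a vendor HTTP API at this point.
Provider = Callable[[str, str], Tuple[str, float]]


class BudgetExceeded(Exception):
    """Raised when a team has already reached its spending cap."""


@dataclass
class MiniGateway:
    provider: Provider
    team_caps_usd: Dict[str, float]
    spend_usd: Dict[str, float] = field(default_factory=dict)
    cache: Dict[str, str] = field(default_factory=dict)

    def _cache_key(self, model: str, prompt: str) -> str:
        # Hash the model and prompt together so identical requests share a key.
        raw = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def complete(self, team: str, model: str, prompt: str) -> str:
        key = self._cache_key(model, prompt)
        if key in self.cache:
            # Repeat prompt: served from cache, no provider call and no cost.
            return self.cache[key]

        spent = self.spend_usd.get(team, 0.0)
        if spent >= self.team_caps_usd.get(team, 0.0):
            # Teams without a configured cap default to 0.0 and are rejected.
            raise BudgetExceeded(f"team {team!r} reached its cap")

        text, cost = self.provider(model, prompt)
        self.spend_usd[team] = spent + cost  # cost attribution per team
        self.cache[key] = text
        return text
```

In this sketch the cache is checked before the budget, so a team that has hit its cap can still receive cached answers. Whether that is desirable is a policy decision that a real deployment would need to make explicitly.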

## When to use

- Any production AI app at non-trivial scale.

## Common mistakes

- Hosting the gateway in a region far from your app or providers, which adds latency to every LLM call.
- Leaving fallback unconfigured, so a primary-provider outage takes down your app.
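The fallback mistake above is easiest to see in code. The following is a minimal sketch of ordered fallback, assuming each provider can be modelled as a function that takes a prompt and either returns text or raises. The names `complete_with_fallback` and `AllProvidersFailed` are hypothetical and do not come from any gateway's API.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a provider is a function prompt -> text that raises on failure.
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Provider]],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would filter retryable errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

With a single entry in `providers`, any error from that provider surfaces directly to the app, which is the outage scenario described above. Adding a second entry lets the request succeed as long as one provider is healthy.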

## Related terms

- [model-router](https://promtable.com/glossary/model-router)
- [ai-router-fallback](https://promtable.com/glossary/ai-router-fallback)
- [cost-attribution](https://promtable.com/glossary/cost-attribution)

*Last updated: 2026-06-01*

---

Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/llm-gateway".
Contact: info@vibecodingturkey.com.