# MoE routing

**Source:** https://promtable.com/glossary/moe-routing

> MoE routing is the per-token gating function inside a Mixture-of-Experts model that selects which expert sub-networks process each token. It is the critical detail that determines MoE quality and efficiency.

---

Mixture-of-Experts (MoE) models contain N expert sub-networks; for each token, a router scores the experts and activates only K of them (typically between 1 and 8). The router's job is critical: bad routing either collapses onto a few experts, which defeats the point of having many, or spreads tokens too thinly across experts. Modern MoE training addresses these failure modes with techniques such as auxiliary load-balancing losses, expert-choice routing, and Switch-style top-1 routing. By 2026, Llama 4 Maverick, Mixtral 8x22B, DBRX, and DeepSeek V3 all use MoE with carefully tuned routing. From a developer's perspective MoE is mostly invisible, because the API or self-hosted inference engine handles routing, but understanding it matters when debugging quality regressions or sizing inference compute.
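
To make the gating step concrete, here is a minimal sketch of top-K routing for a single token, together with an auxiliary load-balancing loss in the style of the Switch Transformer. It is illustrative only: the function names (`route_top_k`, `load_balance_loss`) are hypothetical, the router logits are plain Python lists rather than the output of a learned layer, and normalizing the dispatch fraction by K is an assumption made here so that the loss generalizes beyond top-1 routing.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Select the k highest-probability experts for one token.

    Returns (expert_indices, gate_weights). The gate weights are
    renormalized over the selected experts so that they sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return top, [probs[i] / mass for i in top]


def load_balance_loss(batch_logits, k):
    """Auxiliary loss N * sum_i(f_i * P_i) over a batch of tokens.

    f_i: fraction of routing assignments dispatched to expert i
         (divided by k here, an assumption, so that the f_i sum to 1).
    P_i: mean router probability assigned to expert i.
    """
    n_experts = len(batch_logits[0])
    n_tokens = len(batch_logits)
    counts = [0] * n_experts
    prob_sums = [0.0] * n_experts
    for logits in batch_logits:
        chosen, _ = route_top_k(logits, k)
        for i in chosen:
            counts[i] += 1
        for i, p in enumerate(softmax(logits)):
            prob_sums[i] += p
    f = [c / (n_tokens * k) for c in counts]
    p_mean = [s / n_tokens for s in prob_sums]
    return n_experts * sum(fi * pi for fi, pi in zip(f, p_mean))
```

Under perfectly uniform routing this loss equals 1, and it grows toward N as the router collapses onto a single expert, which is why adding a scaled version of it to the training objective discourages the collapse failure mode described above.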

## Common mistakes

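- Sizing MoE inference compute by total parameter count. Per-token compute scales with the active-parameter count; the total parameter count still determines how much memory the weights occupy, because every expert must be loaded.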
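
The difference between total and active parameters can be worked through with a small calculation. The sketch below is a rough sizing aid, not a profiler: the helper names are hypothetical, the model dimensions in the example are invented round numbers rather than any real model's figures, and the "about 2 FLOPs per active parameter per token" rule is a common approximation that ignores attention over long contexts.

```python
def moe_param_counts(shared_params, params_per_expert, n_experts, k_active):
    """Return (total, active) parameter counts for a simplified MoE model.

    shared_params: parameters every token uses (attention, embeddings, ...).
    params_per_expert: parameters of one expert, summed over all MoE layers.
    """
    total = shared_params + n_experts * params_per_expert
    active = shared_params + k_active * params_per_expert
    return total, active


def weight_memory_gb(n_params, bytes_per_param=2):
    """Memory needed to hold the weights (2 bytes per parameter ~ fp16/bf16)."""
    return n_params * bytes_per_param / 1e9


def forward_flops_per_token(active_params):
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params


# Hypothetical model: 2B shared parameters, 8 experts of 5B each, top-2 routing.
total, active = moe_param_counts(2_000_000_000, 5_000_000_000, 8, 2)
print(f"total parameters:  {total / 1e9:.0f}B")
print(f"active parameters: {active / 1e9:.0f}B")
print(f"weight memory:     {weight_memory_gb(total):.0f} GB")
print(f"forward FLOPs per token: {forward_flops_per_token(active):.2e}")
```

In this invented example the weights need memory for all 42B parameters, while each token only pays the compute cost of 12B, so a throughput estimate based on the total count would overstate the per-token work by roughly 3.5x.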

## Related terms

- [mixture-of-experts](https://promtable.com/glossary/mixture-of-experts)
- [mixture-of-depths](https://promtable.com/glossary/mixture-of-depths)
- [attention-mechanism](https://promtable.com/glossary/attention-mechanism)

*Last updated: 2026-06-01*

---

Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/moe-routing".
Contact: info@vibecodingturkey.com.