# Mixture of Depths

**Source:** https://promtable.com/glossary/mixture-of-depths

> Mixture of Depths (MoD) is an efficiency technique where the model learns to skip some layers for some tokens — applying compute selectively based on token importance.

---

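Introduced by Google DeepMind researchers in 2024 and adopted in production architectures by 2026, Mixture of Depths complements Mixture of Experts (MoE). Where MoE routes each token to different experts, MoD routes each token to a different effective depth: at a routed block, a learned router scores the tokens, only the top-scoring tokens up to a fixed capacity are processed by the block, and the rest bypass it through the residual connection. Informally, easy tokens skip layers and hard tokens get the full stack. The result is matched quality at lower average compute per token. Combining MoD with MoE gives per-token routing across both expert choice and depth, which is part of how 2026 frontier models keep getting more capable without a proportional increase in compute.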
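The routing step can be illustrated with a minimal pure-Python sketch. It assumes a linear router (a dot product with a weight vector) and a fixed per-block `capacity`; the names `mod_block`, `router_weights`, and `block_fn` are hypothetical and chosen for illustration, not taken from any library. A real implementation would operate on batched tensors and train the router jointly with the model.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mod_block(
    tokens: Sequence[Vector],
    router_weights: Vector,
    block_fn: Callable[[Vector], Vector],
    capacity: int,
) -> Tuple[List[Vector], List[int]]:
    """Run block_fn only on the `capacity` highest-scoring tokens.

    Hypothetical helper: all other tokens skip the block and are
    returned unchanged, as if carried by the residual connection.
    """
    # Router: one scalar score per token (assumed linear here).
    scores = [sum(w * x for w, x in zip(router_weights, tok)) for tok in tokens]

    # Top-k selection: keep the indices of the highest-scoring tokens.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    selected = set(ranked[:capacity])

    outputs: List[Vector] = []
    for i, tok in enumerate(tokens):
        if i in selected:
            update = block_fn(tok)
            # Residual plus the block update, scaled by the router score
            # so that the router would receive gradient during training.
            outputs.append([x + scores[i] * u for x, u in zip(tok, update)])
        else:
            # Skipped token: identity path, no block compute spent.
            outputs.append(list(tok))
    return outputs, sorted(selected)


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
    out, picked = mod_block(toy_tokens, [1.0, 0.5], lambda v: [1.0 for _ in v], capacity=2)
    print("processed token indices:", picked)
    print("outputs:", out)
```

With four tokens and `capacity=2`, the block runs on only half of the tokens, which is where the lower average compute comes from in this toy setting.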

## Common mistakes

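- Conflating MoD with MoE: MoE chooses which expert processes a token, while MoD chooses whether a block processes the token at all. They are complementary, not interchangeable.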
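
The difference between the two routing decisions can be sketched side by side. This is a deliberately simplified illustration, assuming top-1 expert selection for MoE and a precomputed boolean routing decision for MoD; `moe_layer` and `mod_layer` are hypothetical names, not functions from any framework.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def moe_layer(token: Vector, experts: Sequence[Layer], expert_scores: Sequence[float]) -> Vector:
    """MoE (simplified, top-1): the token is always processed;
    the router only decides WHICH expert does the processing."""
    best = max(range(len(experts)), key=lambda i: expert_scores[i])
    update = experts[best](token)
    return [x + u for x, u in zip(token, update)]


def mod_layer(token: Vector, block: Layer, selected: bool) -> Vector:
    """MoD (simplified): the router decides WHETHER the block runs;
    an unselected token passes through unchanged."""
    if not selected:
        return list(token)
    update = block(token)
    return [x + u for x, u in zip(token, update)]
```

In this sketch `moe_layer` spends the same compute on every token, while `mod_layer` spends none on a skipped token, which is why the two mechanisms can be combined rather than substituted for each other.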

## Related terms

- [mixture-of-experts](https://promtable.com/glossary/mixture-of-experts)
- [reasoning-model](https://promtable.com/glossary/reasoning-model)
- [speculative-decoding](https://promtable.com/glossary/speculative-decoding)

## Sources

- [Mixture-of-Depths (arXiv)](https://arxiv.org/abs/2404.02258)

*Last updated: 2026-06-01*

---

Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/mixture-of-depths".
Contact: info@vibecodingturkey.com.