# Retry with backoff

**Source:** https://promtable.com/glossary/retry-with-backoff

> Retry with backoff is the production resilience pattern for handling transient LLM API failures (rate limits, 5xx, timeouts) — retry with exponentially-growing delays + jitter to avoid thundering herd. Built into most AI SDKs by default in 2026.

---
Retry with backoff is the production resilience pattern for handling transient LLM API failures (rate limits, 5xx, timeouts) — retry with exponentially-growing delays + jitter to avoid thundering herd. Built into most AI SDKs by default in 2026.

LLM APIs fail transiently: rate limits hit, the model overloads, network blips. Naive immediate retry makes it worse (thundering herd). Retry with exponential backoff: wait 1s, then 2s, then 4s, then 8s, with random jitter (avoid all clients retrying in sync). Stop after N attempts. Production tuning: shorter backoff for cheap retries (rate-limit headers), longer for expensive (500 errors), per-endpoint policies (chat vs streaming vs tool use), respect Retry-After header. Most AI SDKs implement this by default; customize via timeout + max retries config. Hugely improves resilience for free — most production failures are transient + retry succeeds.

## When to use

- Any production LLM call.

## Common mistakes

- No jitter — synchronized retry storms.
- Retrying on 400 errors — bad request won't fix itself.

## Related terms

- [ai-sdk](https://promtable.com/glossary/ai-sdk)
- [rate-limit](https://promtable.com/glossary/rate-limit)
- [ai-router-fallback](https://promtable.com/glossary/ai-router-fallback)

*Last updated: 2026-06-01*
---

Original page: https://promtable.com/glossary/retry-with-backoff
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/retry-with-backoff".
Contact: info@vibecodingturkey.com.