# Reranker

**Source:** https://promtable.com/glossary/reranker

> A reranker is a small cross-encoder model that takes a query and a candidate document and outputs a relevance score. It is used as the second stage after embedding retrieval to push the right answer to the top.

---

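A two-stage retrieval pipeline works as follows. Stage 1 uses fast vector or BM25 search to fetch the top 50-200 candidates. Stage 2 runs a reranker over each (query, candidate) pair to produce the final ordering. Rerankers are slower per pair than embedding lookups, because a cross-encoder runs attention over the query and the document together at query time, whereas embeddings encode each text separately and document vectors can be precomputed. That joint view of query and document is also why rerankers tend to be materially more accurate.

Rerankers used in production in 2026 include Cohere Rerank 3, Voyage Rerank, Jina Reranker v2, and BAAI bge-reranker. ColBERT is often listed alongside them, although it is a late-interaction model rather than a cross-encoder. Reranking is commonly reported to improve RAG accuracy by 10-30% over embeddings alone while adding under 100 ms of latency per query; the actual gain and latency depend on the model, the candidate count, document length, and hardware. Cost matters too: each query requires one pair scoring per candidate, so reranking 100 candidates per query adds up at scale.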
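
The two stages described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: `cosine_bow` is a bag-of-words stand-in for embedding or BM25 search, and `toy_pair_scorer` is a hypothetical stand-in for a real cross-encoder. All function names and the sample corpus are assumptions made for this example. In a real pipeline, `score_pair` would wrap an actual reranker model or API.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def cosine_bow(a: str, b: str) -> float:
    """Stage-1 stand-in: cosine similarity over bag-of-words counts.
    Each text is encoded separately, as with embeddings."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: List[str], top_n: int) -> List[str]:
    """Stage 1: cheap scoring over the whole corpus, keep the top_n candidates."""
    return sorted(corpus, key=lambda doc: cosine_bow(query, doc), reverse=True)[:top_n]


def toy_pair_scorer(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: it looks at query and
    document together and rewards query word pairs that appear in order."""
    q, d = tokenize(query), tokenize(doc)
    doc_bigrams = set(zip(d, d[1:]))
    phrase_hits = sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)
    term_hits = sum(1 for term in set(q) if term in d)
    return phrase_hits + 0.1 * term_hits


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Stage 2: score every (query, candidate) pair, then sort by that score."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up sample data for illustration only.
corpus = [
    "password policy and reset schedule for admins",
    "how to reset password from the login page",
    "billing faq and invoice history",
]
query = "reset password"

candidates = retrieve(query, corpus, top_n=2)            # stage 1
final = rerank(query, candidates, toy_pair_scorer, 2)    # stage 2
```

In this toy corpus, stage 1 ranks the shorter policy document first because both contain the two query terms, while stage 2 promotes the document that actually contains the phrase. The reranker only ever sees the `top_n` candidates, so a relevant document that stage 1 misses cannot be recovered in stage 2.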

## When to use

- Any production RAG pipeline above toy scale.
- When embedding-only top-K is too noisy.

## Common mistakes

- Reranking too many candidates: returns diminish past roughly the top 100, while cost and latency keep growing with every extra pair.
- Skipping rerank in production: pure embedding search tends to plateau at around 70% recall@10 on hard queries.
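
Both mistakes above are easier to catch with a measurement than with a rule of thumb. The sketch below computes recall@k, taken here as the fraction of known relevant documents that appear in the top k results, so that embedding-only and reranked orderings can be compared on a labeled query set. The function names and document IDs are hypothetical and the sample rankings are made up; this is not benchmark data.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(runs: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average recall@k over (ranking, relevant set) pairs, one per query."""
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Illustrative rankings for one query with two relevant documents.
relevant = {"d1", "d9"}
embedding_only = ["d3", "d7", "d1", "d4", "d9"]
reranked = ["d1", "d9", "d3", "d7", "d4"]

before = recall_at_k(embedding_only, relevant, k=2)
after = recall_at_k(reranked, relevant, k=2)
```

Running the same comparison with candidate pools of different sizes (for example 25, 50, 100, 200) is one way to check where the gains flatten out for a given corpus.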

## Related terms

- [embeddings](https://promtable.com/glossary/embeddings)
- [hybrid-search](https://promtable.com/glossary/hybrid-search)
- [bm25](https://promtable.com/glossary/bm25)

## Sources

- [Cohere Rerank docs](https://docs.cohere.com/docs/rerank-overview)

*Last updated: 2026-06-01*

---

Original page: https://promtable.com/glossary/reranker
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/reranker".
Contact: info@vibecodingturkey.com.