ANN index

An ANN (approximate nearest neighbor) index is the data structure inside a vector database that returns near-best matches far faster than an exact scan, often in sub-millisecond time. HNSW, IVF, ScaNN, and DiskANN are popular implementations as of 2026.

Exact nearest-neighbor search is O(N): every query scans every vector. At 100M vectors, that means seconds per query. ANN indexes trade a small accuracy loss (typically 90-99% recall@10) for a 100-1000× speedup.

Common families:

- HNSW (Hierarchical Navigable Small World): a layered graph index. Fast, accurate, RAM-heavy, and the default in most modern vector databases.
- IVF (Inverted File): clusters the vectors, then searches only the clusters closest to the query. Faster to build, with recall governed by how many clusters are probed.
- ScaNN: Google's hybrid of partitioning and quantization, with strong recall at low latency.
- DiskANN: a disk-friendly design for billion-scale datasets.

Tuning matters. In HNSW, M and ef_construction trade build time and memory against recall, while the query-time ef parameter (called ef_search in some systems) trades query speed against recall. Modern vector databases hide most of this, but production teams still benchmark on their own dataset before choosing an index.
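The speed-for-recall trade can be seen in a few lines of code. The sketch below is a toy IVF-style index in pure Python, not how any production library implements it: `ToyIVF`, `nlist`, and `nprobe` are illustrative names, the data is random, and centroids are simply sampled data points where a real IVF index would run k-means.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_search(vectors, query, k):
    """Brute force: score every vector, O(N) per query."""
    order = sorted(range(len(vectors)), key=lambda i: sq_dist(vectors[i], query))
    return order[:k]

class ToyIVF:
    """Toy inverted-file index: bucket each vector under its nearest centroid."""

    def __init__(self, vectors, nlist, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors
        # Assumption: centroids are sampled data points; real IVF runs k-means.
        self.centroids = rng.sample(vectors, nlist)
        self.lists = [[] for _ in range(nlist)]
        for i, v in enumerate(vectors):
            self.lists[self._nearest_centroids(v, 1)[0]].append(i)

    def _nearest_centroids(self, v, n):
        """Indices of the n centroids closest to v."""
        order = sorted(range(len(self.centroids)),
                       key=lambda c: sq_dist(self.centroids[c], v))
        return order[:n]

    def search(self, query, k, nprobe):
        """Scan only the nprobe closest buckets; returns (ids, vectors_scanned)."""
        candidates = [i for c in self._nearest_centroids(query, nprobe)
                      for i in self.lists[c]]
        candidates.sort(key=lambda i: sq_dist(self.vectors[i], query))
        return candidates[:k], len(candidates)

# Random 8-dimensional data, purely for illustration.
rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]
index = ToyIVF(data, nlist=20)

truth = set(exact_search(data, query, k=10))
for nprobe in (1, 5, 20):
    ids, scanned = index.search(query, k=10, nprobe=nprobe)
    recall = len(truth & set(ids)) / 10
    print(f"nprobe={nprobe:2d} scanned={scanned:4d} recall@10={recall:.2f}")
```

Probing more buckets scans more vectors and recovers more of the true top 10. With `nprobe` equal to `nlist`, the search degenerates into the exact scan.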

When to use an ANN index

Common mistakes

FAQ

What is an ANN index?

An ANN (approximate nearest neighbor) index is the data structure inside a vector database that returns near-best matches far faster than an exact scan, often in sub-millisecond time. HNSW, IVF, ScaNN, and DiskANN are popular implementations as of 2026.

When should I use an ANN index?

For any vector search over more than about 1M vectors, where scanning every vector per query becomes too slow.

What are the most common mistakes with an ANN index?

Accepting default parameters without testing them: an index tuned to the workload can be 5× faster. Not measuring recall: quality regressions then go unnoticed.
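One way to catch such regressions is to compute recall@k against exact results for a fixed set of queries and fail the check when it drops. The sketch below assumes you already have both result lists. The ids, the `recall_at_k` helper, and the `RECALL_FLOOR` threshold are made up for illustration, and a real floor should come from your own benchmarks.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean fraction of the true top-k that the ANN index also returned."""
    if not exact_ids or len(approx_ids) != len(exact_ids):
        raise ValueError("need one non-empty result list per query in both inputs")
    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        total += len(set(approx[:k]) & set(exact[:k])) / k
    return total / len(exact_ids)

# Hypothetical results for two queries (ids are invented for illustration).
exact = [[1, 2, 3, 4], [10, 11, 12, 13]]    # from a brute-force scan
approx = [[1, 2, 3, 9], [10, 11, 12, 13]]   # from the ANN index

score = recall_at_k(approx, exact, k=4)
RECALL_FLOOR = 0.85  # assumed threshold, not a recommended value
assert score >= RECALL_FLOOR, f"recall@4 dropped to {score:.2f}"
```

Running the same check after every parameter change or index rebuild makes a recall drop visible instead of silent.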

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/ann-index.md.