# Speaker diarisation

**Source:** https://promtable.com/glossary/diarisation

> Speaker diarisation is the technique of segmenting an audio recording by who spoke when, answering "who said what" rather than just "what was said". It is used heavily in meeting transcription, podcasts, and call analytics.

---

Diarisation segments audio into per-speaker turns, for example "Speaker 1: hello", "Speaker 2: hi". Production STT platforms (Deepgram, AssemblyAI, Google STT) ship diarisation as a configurable option. Quality varies: clean two-speaker calls are handled well, while messy multi-speaker meetings with overlapping speech remain hard. Diarisation on its own yields anonymous labels; pair it with speaker identification (matching a diarised speaker to a known voice from a sample) to get named speakers. It is used in meeting summaries (Otter, Read.ai, Granola), call centre analytics, podcast transcription, and forensic audio analysis.
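To make the idea of per-speaker turns concrete, here is a minimal sketch of how diarisation segments might be combined with word timestamps from a transcript. The `Segment` and `Word` shapes and the `words_to_turns` helper are assumptions made for illustration; they do not mirror the response format of any particular STT platform.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A diarised span: one anonymous speaker label and its time range (seconds)."""
    speaker: str
    start: float
    end: float


@dataclass
class Word:
    """A transcribed word with its time range (seconds)."""
    text: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time ranges (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def words_to_turns(words, segments):
    """Assign each word to the segment it overlaps most, then merge
    consecutive words from the same speaker into one turn."""
    turns = []  # list of [speaker, [word texts]]
    for word in words:
        best = max(
            segments,
            key=lambda s: overlap(word.start, word.end, s.start, s.end),
            default=None,
        )
        if best is None or overlap(word.start, word.end, best.start, best.end) == 0.0:
            speaker = "Unknown"  # word falls outside every diarised segment
        else:
            speaker = best.speaker
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word.text)
        else:
            turns.append([speaker, [word.text]])
    return [(speaker, " ".join(texts)) for speaker, texts in turns]


# Hypothetical input data for illustration only.
segments = [Segment("Speaker 1", 0.0, 1.0), Segment("Speaker 2", 1.0, 2.0)]
words = [Word("hello", 0.1, 0.5), Word("hi", 1.2, 1.5)]
for speaker, text in words_to_turns(words, segments):
    print(f"{speaker}: {text}")
```

Assigning each word to the segment with the largest time overlap is one simple heuristic; it does not handle overlapping speech, where a word may legitimately belong to a stretch in which two people talk at once.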

## When to use

- Meeting transcription.
- Podcast / interview transcription.
- Call centre analytics.

## Common mistakes

- Expecting accurate diarisation on heavily overlapping speech: current models struggle with it.
- Not pairing with speaker identification: anonymous labels such as "Speaker 3" are of limited use when readers need names.
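
The second point can be sketched as a lookup from anonymous labels to known names. The example below assumes each diarised speaker and each enrolled person is already represented by a voice embedding (a list of floats from some speaker-embedding model, not shown here). The `name_speakers` helper, the two-dimensional toy vectors, and the `0.8` threshold are all hypothetical; a real system would tune the threshold on its own data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def name_speakers(cluster_embeddings, enrolled, threshold=0.8):
    """Map each anonymous label to the best-matching enrolled name.
    Labels with no match at or above the threshold keep their anonymous label."""
    names = {}
    for label, embedding in cluster_embeddings.items():
        best_name, best_score = None, threshold
        for name, reference in enrolled.items():
            score = cosine_similarity(embedding, reference)
            if score >= best_score:
                best_name, best_score = name, score
        names[label] = best_name if best_name is not None else label
    return names


# Toy embeddings for illustration only; real embeddings have many more dimensions.
enrolled = {"Ada": [1.0, 0.0], "Grace": [0.0, 1.0]}
clusters = {
    "Speaker 1": [0.9, 0.1],
    "Speaker 2": [0.1, 0.9],
    "Speaker 3": [0.7, 0.7],  # not close enough to either enrolled voice
}
print(name_speakers(clusters, enrolled))
```

Keeping the anonymous label when no enrolled voice clears the threshold avoids attaching the wrong name to a guest or unknown caller.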

## Related terms

- [voice](https://promtable.com/glossary/voice)
- [streaming-stt](https://promtable.com/glossary/streaming-stt)

*Last updated: 2026-06-01*

---

Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/diarisation".
Contact: info@vibecodingturkey.com.