# Voice agent platform

**Source:** https://promtable.com/glossary/voice-agent-platform

> A voice agent platform is a managed stack that combines STT, LLM, TTS, and telephony behind a single API for building production phone and voice agents. Vapi, Retell, and Bland are the 2026 leaders.

---

Building a voice agent from scratch in 2026 means assembling and operating every layer yourself:

- Streaming STT (Deepgram, AssemblyAI).
- A low-latency LLM with [[tool-call-streaming]] (Claude, GPT, Groq).
- Streaming TTS (ElevenLabs, Cartesia).
- Interrupt handling, VAD, and turn-taking.
- Phone integration (Twilio, Vonage).
- Call recording, transcripts, and evals.

Voice agent platforms bundle all of it. The trade-offs are speed-to-market versus vendor lock-in, per-minute pricing versus raw token cost, and opinionated turn-taking versus custom control. As of 2026, Vapi, Retell, Bland, and Synthflow dominate the managed side, alongside the open-source frameworks LiveKit Agents and Pipecat. Sub-600 ms round-trip latency is the production bar.
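To make the 600 ms bar concrete, a per-stage latency budget can help show where a turn spends its time. The following is a minimal sketch, not any platform's API: the stage names, the `StageLatency` type, and every millisecond figure are hypothetical placeholders, and it assumes round-trip means the time from the end of caller speech to the first audio played back.

```python
from dataclasses import dataclass

# Round-trip target quoted in this entry.
PRODUCTION_BAR_MS = 600.0


@dataclass(frozen=True)
class StageLatency:
    """One pipeline stage and its contribution to a single turn (hypothetical type)."""
    name: str
    ms: float


def round_trip_ms(stages):
    """Total turn latency, assuming the stages run strictly one after another."""
    return sum(stage.ms for stage in stages)


def within_budget(stages, bar_ms=PRODUCTION_BAR_MS):
    """True when the summed stage latencies meet the bar."""
    return round_trip_ms(stages) <= bar_ms


def slowest_stage(stages):
    """The stage to optimize first."""
    return max(stages, key=lambda stage: stage.ms)


# Illustrative numbers only; measure your own stack before relying on them.
example_turn = [
    StageLatency("telephony_in", 40),
    StageLatency("stt_final", 150),
    StageLatency("llm_first_token", 250),
    StageLatency("tts_first_audio", 100),
    StageLatency("telephony_out", 40),
]

if __name__ == "__main__":
    print(f"round trip: {round_trip_ms(example_turn):.0f} ms")
    print(f"within budget: {within_budget(example_turn)}")
    print(f"slowest stage: {slowest_stage(example_turn).name}")
```

Summing the stages treats the pipeline as strictly sequential, which is a simplification: streaming STT, LLM, and TTS overlap in practice, so real round-trip times are usually lower than a naive sum of per-stage figures suggests.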

## When to use

- Building production phone agents.
- Voice apps where speed-to-market matters.

## Common mistakes

- Building from scratch: voice agent infrastructure takes 3+ months to build, while platforms can ship in days.
- Skipping latency testing: anything over a 1 s round trip kills the UX.
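Latency testing does not need heavy tooling to start. As a rough sketch, assuming you have already collected round-trip times in milliseconds from test calls, the summary below reports the median and 95th percentile and counts turns over the 1 s limit. The function name `latency_report`, the choice of the 95th percentile as the pass criterion, and the sample values are illustrative assumptions, not part of any platform's tooling.

```python
import statistics


def latency_report(samples_ms, bar_ms=600.0, hard_limit_ms=1000.0):
    """Summarize measured round-trip latencies for a batch of test turns.

    samples_ms: round-trip times in milliseconds, one per turn.
    bar_ms: the target the 95th percentile should stay under (assumed criterion).
    hard_limit_ms: the point at which a single turn is considered broken.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to compute percentiles")

    # n=100 yields 99 cut points; index 49 is the median, index 94 is p95.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    p50, p95 = cuts[49], cuts[94]

    return {
        "p50_ms": p50,
        "p95_ms": p95,
        "over_hard_limit": sum(1 for s in samples_ms if s > hard_limit_ms),
        "passes_bar": p95 <= bar_ms,
    }


if __name__ == "__main__":
    # Hypothetical measurements from five test turns.
    report = latency_report([400, 450, 500, 550, 1200])
    for key, value in report.items():
        print(f"{key}: {value}")
```

Judging on a high percentile rather than the mean is a deliberate choice here: a few slow turns are enough to make a call feel broken, and an average hides them.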

## Related terms

- [barge-in](https://promtable.com/glossary/barge-in)
- [vad](https://promtable.com/glossary/vad)
- [streaming-stt](https://promtable.com/glossary/streaming-stt)

*Last updated: 2026-06-01*

---

Original page: https://promtable.com/glossary/voice-agent-platform
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/voice-agent-platform".
Contact: info@vibecodingturkey.com.