concept

Barge-in

Barge-in is the voice-agent feature where the user can interrupt the assistant mid-response — the assistant detects the speech and stops talking — making conversations feel natural instead of robotic turn-taking.

By 2026 barge-in is non-negotiable for production voice agents. The user starts speaking; the agent's voice activity detector (VAD) detects it within ~50-100ms; TTS halts; STT begins on the user's interrupting speech. Cartesia, ElevenLabs, OpenAI Realtime, and others all expose barge-in primitives. Implementation gotchas: distinguish actual barge-in from background noise; handle partial agent speech (mid-sentence cut should leave the agent in a coherent state); resume gracefully if the interruption was brief. Voice agents without barge-in feel painfully like IVR.

When to use barge-in

Common mistakes

FAQ

What is barge-in?

Barge-in is the voice-agent feature where the user can interrupt the assistant mid-response — the assistant detects the speech and stops talking — making conversations feel natural instead of robotic turn-taking.

When should I use barge-in?

Realtime voice agents. Conversational interfaces.

What are the most common mistakes with barge-in?

Triggering barge-in on background noise — annoys the user. No graceful handling of mid-sentence interruption.

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/barge-in.md.