Voice Models
Voice is what makes an agent feel like a person. Omniflow supports three voice providers, each with its own catalog of voices, languages, and personality knobs.
Supported providers
| Provider | Strengths | Latency | Best for |
|---|---|---|---|
| OpenAI Realtime (GPT-Realtime) | Strong reasoning, natural prosody, broad tool support | ~250-400ms | English-first conversational support |
| Gemini Live | Excellent multilingual support, low cost, fast | ~200-350ms | Multilingual deployments, high volume |
| Cascaded (STT + LLM + TTS) | Deterministic, cheap, fully customizable | ~600-900ms | Long-form replies, scripted flows |
Voice catalog
OpenAI Realtime and Gemini Live ship a fixed catalog of voices; the Cascaded pipeline uses whatever TTS voice you bring. You pick one voice per agent and can preview each in the Studio before committing.
| Voice | Provider | Vibe |
|---|---|---|
| Aria | OpenAI | Warm, mid-range, professional |
| Sage | OpenAI | Calm, neutral, slightly formal |
| Puck | Gemini | Bright, upbeat, slight rasp |
| Charon | Gemini | Deeper, calm, authoritative |
| Custom | Cascaded | Any TTS voice you bring (ElevenLabs, Azure, AWS Polly) |
Tuning behavior
Once you’ve picked a voice, the Voice settings panel exposes a small number of safe knobs:
- Speech rate: slow down for clarity, speed up for energy. Staying within ±15% of the default rate is the sweet spot.
- Filler words: toggle natural ums and pauses on for warmth, off for crispness.
- Interruption sensitivity: how quickly the agent yields when the customer starts speaking.
- Silence threshold: how long the agent waits before assuming the customer is done.
Don’t crank speech rate beyond ±25%. Some voices distort or skip syllables at extreme rates, and customers perceive it as broken rather than fast.
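The rate guidance above can be expressed as a small pre-flight check. This is a minimal sketch, not Omniflow's actual settings schema: the `VoiceSettings` class, its field names, the units, and the defaults are all hypothetical. Only the ±15% and ±25% figures come from this page.

```python
from dataclasses import dataclass

SWEET_SPOT = 0.15  # ±15%: the range this page recommends
HARD_LIMIT = 0.25  # ±25%: beyond this, some voices distort or skip syllables


@dataclass
class VoiceSettings:
    """Hypothetical container for the four knobs (illustrative names only)."""
    speech_rate_delta: float = 0.0         # fractional change; 0.10 means 10% faster
    filler_words: bool = True              # natural ums and pauses on/off
    interruption_sensitivity: float = 0.5  # assumed 0..1 scale
    silence_threshold_ms: int = 700        # assumed unit and default


def check_speech_rate(settings: VoiceSettings) -> str:
    """Classify the speech-rate change as 'ok', 'warn', or 'reject'."""
    delta = abs(settings.speech_rate_delta)
    if delta > HARD_LIMIT:
        return "reject"  # outside ±25%
    if delta > SWEET_SPOT:
        return "warn"    # allowed, but outside the ±15% sweet spot
    return "ok"
```

A check like this could run before saving a configuration, so that an out-of-range rate is caught in review rather than on a live call.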
Switching providers
You can switch providers without rewriting the prompt — Omniflow handles the model differences. Things to expect:
- Voices don’t carry across providers. Re-pick a voice after switching.
- Latency may shift. Run a few test calls to confirm it still feels right.
- Tool-calling syntax is normalized — your tools keep working.
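Because voices don't carry across providers, it can help to check whether an agent's current voice exists in the target provider's catalog before switching. The sketch below is illustrative only: the `FIXED_CATALOG` mapping and the provider keys are assumptions built from the voice table on this page, not an Omniflow API.

```python
# Voices per provider, taken from the catalog table on this page.
# The dictionary keys are hypothetical identifiers, not official provider IDs.
FIXED_CATALOG = {
    "openai": {"Aria", "Sage"},
    "gemini": {"Puck", "Charon"},
    "cascaded": set(),  # bring-your-own TTS voice, so nothing is preset
}


def needs_voice_repick(voice: str, new_provider: str) -> bool:
    """Return True if the voice is not in the new provider's fixed catalog."""
    return voice not in FIXED_CATALOG[new_provider]


# Example: moving an agent that uses Aria from OpenAI to Gemini.
if needs_voice_repick("Aria", "gemini"):
    print("Pick a Gemini voice before going live.")
```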
Multilingual setup
A single agent can speak multiple languages. Add languages on the Language tab; the agent picks up the customer’s language from the channel hint or auto-detects from the first turn.
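The selection order described above (channel hint first, then detection from the first turn) can be sketched as follows. This is an assumption-laden illustration: `resolve_language`, its parameters, and the fallback to the first configured language are hypothetical, and `toy_detect` is a stand-in for whatever language detector the runtime actually uses.

```python
from typing import Callable, Optional, Sequence


def resolve_language(
    enabled: Sequence[str],
    channel_hint: Optional[str],
    first_turn: str,
    detect: Callable[[str], Optional[str]],
) -> str:
    """Pick the conversation language from the agent's enabled languages."""
    # 1. Prefer the channel hint when it names an enabled language.
    if channel_hint in enabled:
        return channel_hint
    # 2. Otherwise try to detect the language from the customer's first turn.
    detected = detect(first_turn)
    if detected in enabled:
        return detected
    # 3. Assumed fallback: the first language configured on the agent.
    return enabled[0]


def toy_detect(text: str) -> Optional[str]:
    """Stand-in detector for the example; recognizes one Spanish greeting."""
    return "es" if "hola" in text.lower() else None


language = resolve_language(["en", "es"], None, "Hola, buenos días", toy_detect)
```

Checking the hint before running detection keeps the common case cheap, since detection only runs when the channel provides no usable hint.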
Related
| If you want to… | Go to |
|---|---|
| Read the runtime architecture | Voice Runtime |
| Set fallback rules between providers | Reliability & Failover |
| Connect a phone number | Telephony |