Humanness Index™ · TTS model

ElevenLabs Eleven v3

Eleven v3 is ElevenLabs' flagship expressive model, launched in alpha in June 2025 and generally available since March 2026.

Rank: #1
Humanness: 96
Likely rank: #1–5
Blind votes: 1,051

Standings as of Jul 28, 2026, 01:54 UTC

A real arena clip: a cloned source voice reading a customer support prompt at phone quality.

Eleven v3 key stats

Latency (measured): 758 ms¹
Languages: 70+²
Price / 1M chars: $100³
Released: March 14, 2026⁴

¹ Vapi streaming benchmark (50 trials per model), checked 2026-06-10. Median of 50 sequential live streaming trials, June 2026; includes network RTT from the benchmark machine.
² elevenlabs.io/docs/overview/models, checked 2026-06-10. 74 languages listed.
³ elevenlabs.io/pricing/api, checked 2026-06-10. Bills at the Multilingual v2 / v3 ElevenAPI rate: $0.10 per 1k characters = $100 per 1M.
⁴ elevenlabs.io/blog/eleven-v3, checked 2026-06-10. General availability; alpha launched 2025-06-03.
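The per-character rate in the pricing footnote scales linearly, so a rough cost estimate is simple arithmetic. The sketch below assumes only the $0.10 per 1k characters figure quoted above; the function name is hypothetical, and it ignores plan tiers, included quotas, and any rounding the provider may apply.

```python
from decimal import Decimal

# Rate quoted in the pricing footnote: $0.10 per 1,000 characters.
RATE_PER_1K_CHARS = Decimal("0.10")


def estimate_cost_usd(num_chars: int) -> Decimal:
    """Estimate the synthesis cost in USD for a given character count.

    Hypothetical helper: assumes flat per-character billing with no
    plan discounts or minimum charges.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return RATE_PER_1K_CHARS * Decimal(num_chars) / Decimal(1000)


if __name__ == "__main__":
    # One million characters, the unit used in the key stats.
    print(estimate_cost_usd(1_000_000))
    # A short support reply of roughly 250 characters.
    print(estimate_cost_usd(250))
```

Using `Decimal` rather than floats keeps the cents exact when many small requests are summed.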

Background

Eleven v3 is ElevenLabs' flagship expressive model, launched in alpha in June 2025 and generally available since March 2026. It supports more than 70 languages, multi-speaker dialogue, and inline audio tags such as [whispers] and [laughs] that direct emotion and delivery. ElevenLabs itself does not recommend v3 for real-time conversation because of its higher latency, so its blind-test scores here offer a useful comparison against the faster models ElevenLabs sells alongside it.

Sources: elevenlabs.io, help.elevenlabs.io
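To make the inline tag syntax concrete, here is a minimal sketch that pulls bracketed tags such as [whispers] and [laughs] out of a script and recovers the spoken text. The regular expression and helper names are assumptions for illustration only; they are not part of any ElevenLabs SDK, and the set of tags v3 actually accepts should be checked against the provider's documentation.

```python
import re

# Matches an inline tag written in square brackets, e.g. [whispers].
# Assumption: tags never nest and never contain brackets themselves.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(script: str) -> list[str]:
    """Return the tag names in the order they appear in the script."""
    return TAG_PATTERN.findall(script)


def strip_tags(script: str) -> str:
    """Return the script with tags removed and whitespace collapsed."""
    without_tags = TAG_PATTERN.sub("", script)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    line = "[whispers] I think the refund already went through. [laughs] Let me check."
    print(extract_tags(line))
    print(strip_tags(line))
```

A helper like `strip_tags` is handy when the same script also has to be sent to a model that would read the bracketed words aloud.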

At a glance

v3 is built for expressiveness first: audio tags, dialogue mode, and the widest language coverage on the ElevenLabs platform. In our 50-trial streaming benchmark it returned first audio in a median of 758 ms, well above every real-time model on the Index.

Sources: elevenlabs.io
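The latency figure above is a median time to first audio over sequential trials. The sketch below shows one way such a number could be computed; it is not the benchmark's actual harness. `fake_stream` is a hypothetical stand-in for a real streaming TTS request, and a real measurement would also include network round-trip time, as the footnote notes.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def time_to_first_chunk_ms(open_stream: Callable[[], Iterable[bytes]]) -> float:
    """Measure the time from issuing a request to receiving the first chunk."""
    start = time.perf_counter()
    stream = iter(open_stream())
    next(stream)  # Blocks until the first audio chunk is available.
    return (time.perf_counter() - start) * 1000.0


def median_ttfb_ms(open_stream: Callable[[], Iterable[bytes]], trials: int = 50) -> float:
    """Run the trials one after another and return the median in milliseconds."""
    samples = [time_to_first_chunk_ms(open_stream) for _ in range(trials)]
    return statistics.median(samples)


def fake_stream() -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call: waits, then yields audio."""
    time.sleep(0.01)  # Simulated synthesis and network delay.
    yield b"\x00" * 320
    yield b"\x00" * 320


if __name__ == "__main__":
    print(f"median TTFB: {median_ttfb_ms(fake_stream, trials=5):.1f} ms")
```

The median is used rather than the mean so that a single slow trial does not dominate the reported figure.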

Position in the rankings

Standings as of Jul 28, 2026, 01:54 UTC

Rank	Provider	Model	Humanness	Latency
Baseline	Human	Homo Sapien	100	—
#1	ElevenLabs	Eleven v3	96	758 ms
#2	xAI	Grok TTS	94	460 ms
#3	MiniMax	Speech 2.8	91	325 ms

See the full Humanness Index™ rankings

Frequently asked questions

How is Eleven v3 tested on the Humanness Index™?

Listeners hear Eleven v3 against another model in a blind head-to-head round, with both voices reading the same customer support prompt from the same cloned source voice, and pick whichever sounds more human. Its Humanness score derives purely from those votes.

Why is Eleven v3 slower than other ElevenLabs models?

v3 is built for expressiveness rather than real-time conversation, with audio tags and multi-speaker dialogue. ElevenLabs itself does not recommend it for real-time agents; we measured a 758 ms median time to first byte (TTFB), well above every real-time model on the Index.
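As an illustration of how head-to-head votes can be turned into a number, the sketch below tallies a raw win rate per model from (winner, loser) pairs. This is an assumption for illustration only: the Index's actual scoring formula is not given on this page, and the model names and sample votes are made up.

```python
from collections import Counter
from typing import Iterable, Tuple


def win_rates(votes: Iterable[Tuple[str, str]]) -> dict[str, float]:
    """Return the share of rounds each model won.

    Each vote is a (winner, loser) pair from one blind round.
    """
    wins: Counter = Counter()
    rounds: Counter = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        rounds[winner] += 1
        rounds[loser] += 1
    return {model: wins[model] / rounds[model] for model in rounds}


if __name__ == "__main__":
    # Made-up sample votes, not data from the Index.
    sample_votes = [
        ("model_a", "model_b"),
        ("model_a", "model_c"),
        ("model_b", "model_a"),
        ("model_c", "model_b"),
    ]
    for model, rate in sorted(win_rates(sample_votes).items()):
        print(f"{model}: {rate:.2f}")
```

A raw win rate ignores how strong each opponent was, which is one reason ranking systems built on pairwise votes often use a rating model instead.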

Keep exploring

ElevenLabs: All ElevenLabs models on the Index
Turbo v2: Rank #10 · Humanness 75
Turbo v2.5: Latency 265 ms
Flash v2: Rank #9 · Humanness 76
Flash v2.5: Rank #14 · Humanness 68
Multilingual v2: Latency 1006 ms

