The Humanness Index™

The open benchmark for how human voice AI sounds. Built and operated by Vapi.

Code is Apache-2.0. Standings data is CC BY 4.0. Audio clips and source voices are licensed recordings, all rights reserved. Provider logomarks belong to their respective owners and are used nominatively. “The Humanness Index™” name and logo are Vapi trademarks; see TRADEMARKS.md.

Humanness Index™ · TTS model

Eleven v3

by ElevenLabs

Eleven v3 is ElevenLabs' flagship expressive model, launched in alpha in June 2025 and generally available since March 2026.

Rank: #5
Humanness: 76
Likely rank: #3–15
Blind votes: 102

Standings as of Jun 13, 2026, 00:15 UTC

A real arena clip: a cloned source voice reading a customer support prompt at phone quality.

Eleven v3 key stats

Latency (measured): 758 ms [1]
Languages: 70+ [2]
Price / 1M chars: $100 [3]
Released: March 14, 2026 [4]

  1. Vapi streaming benchmark (50 trials per model), checked 2026-06-10. Median of 50 sequential live streaming trials, June 2026; includes network RTT from the benchmark machine.
  2. elevenlabs.io/docs/overview/models, checked 2026-06-10. 74 languages listed.
  3. elevenlabs.io/pricing/api, checked 2026-06-10. Bills at the Multilingual v2 / v3 ElevenAPI rate: $0.10 per 1k characters = $100 per 1M.
  4. elevenlabs.io/blog/eleven-v3, checked 2026-06-10. General availability; alpha launched 2025-06-03.
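
The pricing arithmetic in note 3 can be sketched as a small estimator. This is a minimal illustration that assumes the flat $0.10 per 1,000 characters rate quoted above and ignores plan tiers, minimums, and discounts; the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Rate quoted in note 3: $0.10 per 1,000 characters (assumed flat for this sketch).
USD_PER_1K_CHARS = Decimal("0.10")


def estimate_cost_usd(num_chars: int) -> Decimal:
    """Estimate the synthesis cost in USD for a given character count."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return USD_PER_1K_CHARS * Decimal(num_chars) / Decimal(1000)


if __name__ == "__main__":
    # One million characters at the quoted rate.
    print(estimate_cost_usd(1_000_000))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact for any character count.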

Background

Eleven v3 is ElevenLabs' flagship expressive model, launched in alpha in June 2025 and generally available since March 2026. It supports more than 70 languages, multi-speaker dialogue, and inline audio tags like [whispers] and [laughs] that direct emotion and delivery. ElevenLabs itself does not recommend v3 for real-time conversation because of its higher latency, which makes its blind-test scores here an interesting comparison against the faster models it sells alongside.

Sources: elevenlabs.io, help.elevenlabs.io
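
As a rough illustration of the inline audio tags mentioned above, the sketch below builds tagged script lines as plain strings. Only [whispers] and [laughs] come from the text; the helper name `tag_line` is hypothetical, and nothing here calls the ElevenLabs API.

```python
def tag_line(text: str, *tags: str) -> str:
    """Prefix a line of script with bracketed audio tags, e.g. [whispers]."""
    prefix = " ".join(f"[{tag}]" for tag in tags)
    return f"{prefix} {text}" if prefix else text


if __name__ == "__main__":
    script = [
        tag_line("Thanks for calling support.", "laughs"),
        tag_line("Your refund is already on its way.", "whispers"),
    ]
    print("\n".join(script))
```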

At a glance

v3 is built for expressiveness first: audio tags, dialogue mode, and the widest language coverage on the ElevenLabs platform. In our 50-trial streaming benchmark it returned first audio in a median of 758 ms, well above every real-time model on the Index.

Sources: elevenlabs.io
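
A median time-to-first-audio figure like the one above could be measured along these lines. This is a sketch under stated assumptions, not the benchmark's actual harness: `open_stream` is a hypothetical callable that starts one streaming request and yields audio chunks, and `fake_stream` merely simulates a slow provider.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def time_to_first_chunk_ms(open_stream: Callable[[], Iterable[bytes]]) -> float:
    """Time from starting the request until the first non-empty chunk arrives."""
    start = time.perf_counter()
    for chunk in open_stream():
        if chunk:
            return (time.perf_counter() - start) * 1000.0
    raise RuntimeError("stream ended before any audio arrived")


def median_ttfb_ms(open_stream: Callable[[], Iterable[bytes]], trials: int = 50) -> float:
    """Run sequential trials and return the median latency in milliseconds."""
    samples = [time_to_first_chunk_ms(open_stream) for _ in range(trials)]
    return statistics.median(samples)


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a real streaming client (hypothetical, for illustration only)."""
    time.sleep(0.01)  # pretend the provider needs about 10 ms to produce audio
    yield b"\x00" * 320


if __name__ == "__main__":
    print(f"median first-chunk latency: {median_ttfb_ms(fake_stream, trials=5):.1f} ms")
```

The median is less sensitive than the mean to the occasional slow trial, which matters when each sample includes network round-trip time.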

Position in the rankings

Standings as of Jun 13, 2026, 00:15 UTC

| Rank | Provider    | Model      | Humanness | Latency |
|------|-------------|------------|-----------|---------|
| #3   | Cartesia    | Sonic 3.5  | 82        | 128 ms  |
| #4   | Canopy Labs | Orpheus    | 78        | —       |
| #5   | ElevenLabs  | Eleven v3  | 76        | 758 ms  |
| #6   | ElevenLabs  | Turbo v2.5 | 75        | 265 ms  |
| #7   | ElevenLabs  | Flash v2.5 | 72        | 197 ms  |

Frequently asked questions

How is Eleven v3 tested on the Humanness Index™?
Listeners hear Eleven v3 against another model in a blind head-to-head round, with both voices reading the same customer support prompt from the same cloned source voice, and they pick whichever sounds more human. Its Humanness score is derived purely from those votes.
Why is Eleven v3 slower than other ElevenLabs models?
v3 is built for expressiveness rather than real-time conversation, with audio tags and multi-speaker dialogue. ElevenLabs itself does not recommend it for real-time agents; we measured a 758 ms median time to first byte (TTFB), well above every real-time model on the Index.
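
To make the blind head-to-head voting described above concrete, here is a minimal tally of pairwise votes into per-model win rates. This page does not state how votes are converted into the Humanness score, so a plain win rate is an assumption made purely for illustration; the model names and the `win_rates` helper are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def win_rates(votes: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute each model's share of rounds won from (winner, loser) pairs."""
    wins: Counter = Counter()
    rounds: Counter = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        rounds[winner] += 1
        rounds[loser] += 1
    return {model: wins[model] / rounds[model] for model in rounds}


if __name__ == "__main__":
    # Each tuple is one blind round: (model the listener picked, model it beat).
    sample_votes = [
        ("model-a", "model-b"),
        ("model-a", "model-c"),
        ("model-b", "model-a"),
    ]
    for model, rate in sorted(win_rates(sample_votes).items()):
        print(f"{model}: {rate:.2f}")
```

A raw win rate ignores opponent strength; rating systems that account for it are a common alternative, but which approach the Index uses is documented in its methodology rather than here.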

Keep exploring

ElevenLabs: All ElevenLabs models on the Index
Turbo v2: Rank #14 · Humanness 63
Turbo v2.5: Rank #6 · Humanness 75
Flash v2: Rank #17 · Humanness 57
Flash v2.5: Rank #7 · Humanness 72
Multilingual v2: Rank #16 · Humanness 61

How human does your model really sound?

The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.
