The Humanness Index™

The open benchmark for how human voice AI sounds. Built and operated by Vapi.

Code is Apache-2.0. Standings data is CC BY 4.0. Audio clips and source voices are licensed recordings, all rights reserved. Provider logomarks belong to their respective owners and are used nominatively. “The Humanness Index™” name and logo are Vapi trademarks; see TRADEMARKS.md.

Humanness Index™ · TTS model

Sonic 2

by Cartesia

Sonic 2 was Cartesia's second-generation voice model, announced in early 2025 alongside the company's $64M Series A.

Rank: #18
Humanness: 44
Likely rank: #12–18
Blind votes: 112

Standings as of Jun 13, 2026, 01:49 UTC

A real arena clip: a cloned source voice reading a customer support prompt at phone quality.

Sonic 2 key stats

Latency (measured): 159 ms [1]
Languages: 15 [2]
Price / 1M chars: $50 [3]
Released: March 2025 [4]
  1. Vapi streaming benchmark (50 trials per model; checked 2026-06-10). Median of 50 sequential live streaming trials, June 2026; includes network RTT from the benchmark machine.
  2. docs.cartesia.ai/build-with-cartesia/tts-models/latest (checked 2026-06-10).
  3. cartesia.ai/pricing (checked 2026-06-10). 1 credit per character (docs.cartesia.ai/pricing); the entry self-serve Pro plan is $5/mo for 100K credits, an effective rate of $50 per 1M; larger plans drop to $37-39 per 1M. The credit rate is the same for every Sonic model.
  4. cartesia.ai/blog/sonic (checked 2026-06-10, confidence: medium). Announced alongside the $64M Series A in early 2025.
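
The $50 figure in note 3 follows directly from the plan numbers quoted there. A minimal sketch of that arithmetic, assuming 1 credit per character as the note states; the function and parameter names are illustrative and do not come from Cartesia or Vapi:

```python
def price_per_million_chars(plan_usd: float, credits: int,
                            credits_per_char: float = 1.0) -> float:
    """Effective USD cost of 1,000,000 characters on a credit-based plan."""
    # Number of characters the plan's credits can buy (assumed 1 credit = 1 char).
    chars_in_plan = credits / credits_per_char
    return plan_usd * 1_000_000 / chars_in_plan


# Entry Pro plan from note 3: $5/mo for 100K credits.
pro_rate = price_per_million_chars(plan_usd=5.0, credits=100_000)
print(f"${pro_rate:.0f} per 1M characters")
```

The same function can be applied to larger plans to compare effective rates, given their monthly price and credit allowance.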

Background

Sonic 2 was Cartesia's second-generation voice model, announced in early 2025 alongside the company's $64M Series A. It cut time to first audio to roughly 90 ms, added a Turbo variant that reached 40 ms, and supported 15 languages. It established Sonic as one of the fastest production text-to-speech APIs before Sonic 3 took over as the flagship.

Sources: cartesia.ai

At a glance

At launch, Sonic 2 was among the fastest production TTS APIs, with a Turbo variant that reached 40 ms. In our 50-trial streaming benchmark, Sonic 2 returned first audio in a median of 159 ms, including network round-trip time from the benchmark machine.

Sources: docs.cartesia.ai
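
The measured figure is a median over sequential streaming trials. A rough sketch of how such a measurement could be taken, not Vapi's actual benchmark harness: `open_stream` is a hypothetical callable that starts one streaming TTS request and yields audio chunks, and `fake_stream` is a stand-in so the example runs without any network access.

```python
import statistics
import time
from typing import Callable, Iterable


def time_to_first_audio_ms(open_stream: Callable[[], Iterable[bytes]]) -> float:
    """Wall-clock ms from issuing the request to the first non-empty chunk."""
    start = time.perf_counter()
    for chunk in open_stream():
        if chunk:
            return (time.perf_counter() - start) * 1000.0
    raise RuntimeError("stream ended before any audio arrived")


def median_latency_ms(open_stream: Callable[[], Iterable[bytes]],
                      trials: int = 50) -> float:
    """Run the trials one after another and report the median."""
    samples = [time_to_first_audio_ms(open_stream) for _ in range(trials)]
    return statistics.median(samples)


def fake_stream():
    """Hypothetical stand-in for a real streaming TTS call."""
    time.sleep(0.005)  # simulated network and synthesis delay
    yield b"\x00\x01"
    yield b"\x02\x03"


print(f"{median_latency_ms(fake_stream, trials=5):.1f} ms")
```

Because the clock starts before the request is issued, a measurement like this includes network round-trip time. A median is also less sensitive to occasional slow trials than a mean.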

Position in the rankings

Standings as of Jun 13, 2026, 01:49 UTC

Rank  Provider    Model            Humanness  Latency
#16   ElevenLabs  Multilingual v2  59         1006 ms
#17   ElevenLabs  Flash v2         59         226 ms
#18   Cartesia    Sonic 2          44         159 ms
#19   Cartesia    Sonic 3          27         166 ms
#20   Gradium     Gradium TTS      24         332 ms

See the full Humanness Index™ rankings

Frequently asked questions

How is Sonic 2 tested on the Humanness Index™?
Listeners hear Sonic 2 against another model in a blind head-to-head round. Both voices read the same customer support prompt from the same cloned source voice, and listeners pick whichever sounds more human. Its Humanness score derives purely from those votes.
What latency does Sonic 2 have?
Cartesia published roughly 90 ms time to first audio at launch, with a 40 ms Turbo variant. In our 50-trial streaming benchmark, it returned first audio in a median of 159 ms, including network time.
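
This page does not say how blind head-to-head votes are turned into a Humanness score; the methodology is the authority on that. Purely as an illustration of one common way to rank items from pairwise votes, the sketch below fits a Bradley-Terry model. The choice of model, the function name, and the sample votes are all assumptions, not the Index's method or data.

```python
from collections import defaultdict


def bradley_terry(votes, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs via MM updates."""
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    models = set()
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            # Sum over every opponent o of: rounds played / (s_m + s_o).
            denom = sum(
                n / (strength[m] + strength[o])
                for pair, n in pair_counts.items() if m in pair
                for o in pair - {m}
            )
            updated[m] = wins[m] / denom if denom else strength[m]
        # Rescale so strengths average 1; only their ratios carry meaning.
        total = sum(updated.values())
        strength = {m: s * len(models) / total for m, s in updated.items()}
    return strength


# Made-up votes: model "A" preferred over "B" in 3 of 4 blind rounds.
votes = [("A", "B")] * 3 + [("B", "A")]
strengths = bradley_terry(votes)
print(sorted(strengths.items(), key=lambda kv: -kv[1]))
```

Under this model, the probability that one voice is preferred over another is its strength divided by the sum of the two strengths, so a ranking needs only the votes and no absolute ratings.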

Keep exploring

All Cartesia models on the Index
Cartesia Sonic: Rank #21 · Humanness 0
Cartesia Sonic 3: Rank #19 · Humanness 27
Cartesia Sonic 3.5: Rank #3 · Humanness 84

How human does your model really sound?

The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.
