The Humanness Index™

The open benchmark for how human voice AI sounds. Built and operated by Vapi.


Code is Apache-2.0. Standings data is CC BY 4.0. Audio clips and source voices are licensed recordings, all rights reserved. Provider logomarks belong to their respective owners and are used nominatively. “The Humanness Index™” name and logo are Vapi trademarks; see TRADEMARKS.md.

Humanness Index™ · TTS model

Sonic 3.5

by Cartesia

Sonic 3.5 is Cartesia's current flagship, released in May 2026.

Rank: #3
Humanness: 82
Likely rank: #3–15
Blind votes: 98

Standings as of Jun 13, 2026, 00:15 UTC

A real arena clip: a cloned source voice reading a customer support prompt at phone quality.

Sonic 3.5 key stats

Latency (measured): 128 ms [1]
Languages: 42 [2]
Price / 1M chars: $50 [3]
Released: May 4, 2026 [4]

  1. Vapi streaming benchmark, 50 trials per model (checked 2026-06-10). Median of 50 sequential live streaming trials, June 2026; includes network RTT from the benchmark machine.
  2. docs.cartesia.ai (checked 2026-06-10). Applies to Sonic 3 and 3.5; the earlier Sonic and Sonic 2 shipped with 15 languages. The count is encoded per model.
  3. cartesia.ai/pricing (checked 2026-06-10). 1 credit per character (docs.cartesia.ai/pricing); the entry self-serve Pro plan is $5/mo for 100K credits, an effective rate of $50 per 1M; larger plans drop to $37-39 per 1M. The credit rate is the same for every Sonic model.
  4. docs.cartesia.ai/build-with-cartesia/tts-models/latest (checked 2026-06-10). Snapshot release.
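
The effective rate in footnote 3 is simple arithmetic, sketched below in Python. The function name and parameters are illustrative assumptions, not part of any Cartesia or Vapi API; the only inputs taken from the page are the $5/mo plan price, the 100K included credits, and the 1 credit per character rate.

```python
def effective_rate_per_million(plan_price_usd: float,
                               credits_included: int,
                               credits_per_char: float = 1.0) -> float:
    """Return the effective USD cost per 1M characters for a credit plan.

    Hypothetical helper for illustration only.
    """
    # Characters the plan covers at the given credit rate.
    chars_included = credits_included / credits_per_char
    return plan_price_usd * 1_000_000 / chars_included


# Entry Pro plan figures from footnote 3: $5/mo for 100K credits.
rate = effective_rate_per_million(5.0, 100_000)
print(f"${rate:.2f} per 1M characters")  # prints "$50.00 per 1M characters"
```

The same helper can be pointed at a larger plan's price and credit allowance to compare effective rates across tiers.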

Background

Sonic 3.5 is Cartesia's current flagship, released in May 2026. Cartesia positions it as its most natural and fastest model, with sub-90 ms latency and native support for 42 languages. It is tuned for production agent transcripts: it reads order numbers, emails, and confirmation codes correctly without preprocessing, and it resolves heteronyms such as "read" and "bow" from the surrounding words.

Sources: docs.cartesia.ai

At a glance

Sonic 3.5 handles alphanumerics and heteronyms without preprocessing, supports 42 languages, and carries a published sub-90 ms latency claim. In our 50-trial streaming benchmark, it returned first audio in a median of 128 ms, the fastest measured time among current-generation models on the Index.

Sources: docs.cartesia.ai
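
A median time to first audio over sequential trials can be measured roughly as follows. This is a minimal sketch, not the Index's benchmark harness: `open_stream` is a hypothetical callable standing in for whatever opens a streaming TTS request and yields audio chunks, and `fake_stream` only simulates a delay so the example runs without a network.

```python
import statistics
import time
from typing import Callable, Iterable


def time_to_first_chunk_ms(open_stream: Callable[[], Iterable[bytes]]) -> float:
    """Time from issuing the request to receiving the first audio chunk."""
    start = time.perf_counter()
    for _chunk in open_stream():
        # Stop the clock on the first chunk; the rest of the audio is ignored.
        return (time.perf_counter() - start) * 1000.0
    raise RuntimeError("stream ended without producing audio")


def median_latency_ms(open_stream: Callable[[], Iterable[bytes]],
                      trials: int = 50) -> float:
    """Run the trials one after another and report the median."""
    return statistics.median(
        time_to_first_chunk_ms(open_stream) for _ in range(trials)
    )


def fake_stream() -> Iterable[bytes]:
    """Stand-in for a real streaming client (assumption for this demo)."""
    time.sleep(0.01)  # pretend the first chunk takes about 10 ms
    yield b"\x00" * 320


result = median_latency_ms(fake_stream, trials=5)
print(f"median time to first audio: {result:.1f} ms")
```

The median is less sensitive than the mean to an occasional slow trial, which matters when each measurement includes network round-trip time.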

Position in the rankings

Standings as of Jun 13, 2026, 00:15 UTC

Rank  Provider     Model                 Humanness  Latency
#1    xAI          Grok TTS              100        460 ms
#2    xAI          Grok TTS (Streaming)  98         285 ms
#3    Cartesia     Sonic 3.5             82         128 ms
#4    Canopy Labs  Orpheus               78         n/a
#5    ElevenLabs   Eleven v3             76         758 ms

See the full Humanness Index™ rankings

Frequently asked questions

How is Sonic 3.5 tested on the Humanness Index™?
Listeners hear Sonic 3.5 against another model in a blind head-to-head round. Both models read the same customer support prompt in the same cloned source voice, and listeners pick whichever sounds more human. The Humanness score derives purely from those votes.

How fast is Sonic 3.5?
Cartesia publishes a sub-90 ms latency figure. In our 50-trial streaming benchmark, Sonic 3.5 returned first audio in a median of 128 ms, including network time from the benchmark machine.
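
This page does not say how head-to-head votes are turned into a score; the methodology page is the authority on that. As an illustration only, one common way to aggregate pairwise votes is a Bradley-Terry fit, sketched here in Python. The model names and vote counts are made up, and nothing below should be read as the Index's actual scoring code.

```python
from collections import Counter


def bradley_terry(votes, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Uses the standard minorize-maximize update; strengths sum to 1.
    """
    if not votes:
        raise ValueError("need at least one vote")
    wins = Counter()
    pair_counts = Counter()  # rounds played between each unordered pair
    models = set()
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            denom = sum(
                pair_counts.get(frozenset((m, other)), 0)
                / (strength[m] + strength[other])
                for other in models
                if other != m
            )
            updated[m] = wins[m] / denom if denom else strength[m]
        total = sum(updated.values())
        strength = {m: s / total for m, s in updated.items()}
    return strength


# Hypothetical votes: each tuple is (model the listener picked, the other model).
votes = (
    [("A", "B")] * 7 + [("B", "A")] * 3
    + [("A", "C")] * 6 + [("C", "A")] * 4
    + [("B", "C")] * 5 + [("C", "B")] * 5
)
strengths = bradley_terry(votes)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking[0])  # the model preferred most often in this toy data
```

The fitted strengths are relative values, so any mapping onto a 0 to 100 scale would be a separate step that this sketch does not attempt.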

Keep exploring

All Cartesia models on the Index
Cartesia Sonic: Rank #21 · Humanness 0
Cartesia Sonic 2: Rank #18 · Humanness 44
Cartesia Sonic 3: Rank #20 · Humanness 23


How human does your model really sound?

The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.
