The Humanness Index™

The open benchmark for how human voice AI sounds. Built and operated by Vapi.

Code is Apache-2.0. Standings data is CC BY 4.0. Audio clips and source voices are licensed recordings, all rights reserved. Provider logomarks belong to their respective owners and are used nominatively. “The Humanness Index™” name and logo are Vapi trademarks; see TRADEMARKS.md.

Humanness Index™ · Provider

Inworld

inworld.ai

Inworld builds voice models for interactive agents. Its TTS line climbed third-party speech arenas through 2025, with the 1.5 generation reaching the top of the Artificial Analysis Speech Arena ahead of Google and ElevenLabs.

Best ranked model: #9 TTS-2
Humanness: 69

Standings as of Jun 13, 2026, 01:14 UTC

Inworld
Models on the Index: 2
Languages: 15
Price / 1M chars: $35

Inworld models on the Humanness Index™

Rank  Model        Humanness  Latency  Languages  Price / 1M chars
#11   TTS-1.5-max  65         337 ms   15         $35
#9    TTS-2        69         288 ms   100+       $25

About Inworld

Inworld builds voice models for interactive agents. Its TTS line climbed third-party speech arenas through 2025, with the 1.5 generation reaching the top of the Artificial Analysis Speech Arena ahead of Google and ElevenLabs. Integration partners for its realtime models include Vapi.

Sources: inworld.ai

Model line

The 1.5 generation targets the quality and speed balance most production agents need, with a mini sibling for the lowest latency. Realtime TTS-2, released as a research preview in May 2026, conditions on the audio of prior conversation turns and holds a single voice identity across more than 100 languages.

Sources: docs.inworld.ai

Inworld stats

Languages: 15 [1]
Price / 1M chars: $35 [2]

  1. docs.inworld.ai/release-notes/tts (checked 2026-06-10): TTS 1.5 generation; Realtime TTS-2 covers 100+, encoded per model.
  2. inworld.ai/pricing (checked 2026-06-10): On-demand rates: TTS 1.5 Max $35 per 1M characters; TTS-2 $25 per 1M, encoded per model.

Other providers on the Index

ElevenLabs · Best ranked model #5 · Eleven v3
Cartesia · Best ranked model #3 · Sonic 3.5
xAI · Best ranked model #1 · Grok TTS
MiniMax · Best ranked model #8 · Speech 2.5
Gradium · Best ranked model #20 · Gradium TTS
Canopy Labs · Best ranked model #4 · Orpheus
Smallest.ai · Best ranked model #14 · Lightning v3.1
Neuphonic · Best ranked model #13 · neu_hq

How human does your model really sound?

The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.
