The Humanness Index™

The open benchmark for how human voice AI sounds. Built and operated by Vapi.

Code is Apache-2.0. Standings data is CC BY 4.0. Audio clips and source voices are licensed recordings, all rights reserved. Provider logomarks belong to their respective owners and are used nominatively. “The Humanness Index™” name and logo are Vapi trademarks; see TRADEMARKS.md.

Humanness Index™ · Provider

MiniMax

www.minimax.io

MiniMax is the Shanghai AI lab behind the MiniMax speech line, served through its hosted t2a_v2 API with HD and Turbo tiers.

Best ranked model: #8 Speech 2.5
Humanness: 71

Standings as of Jun 13, 2026, 00:15 UTC

Models on the Index: 3
Languages: 40
Price / 1M chars: $60

MiniMax models on the Humanness Index™

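| Rank | Model | Humanness | Latency | Languages | Price / 1M chars |
|------|-------|-----------|---------|-----------|------------------|
| #8 | Speech 2.5 | 71 | 325 ms | 40 | $60 |
| #10 | Speech 2 HD | 64 | 357 ms | 32 | $100 |
| #12 | Speech 2 Turbo | 63 | 315 ms | 32 | $60 |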
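
As a rough illustration of what the Price / 1M chars column means in practice, the sketch below estimates synthesis cost for a given character count. The prices are copied from the standings above. The function name and the assumption of simple linear per-character billing (no minimums, tiers, or volume discounts) are ours for illustration and are not MiniMax's documented billing logic.

```python
# Prices in USD per 1M characters, copied from the standings above.
PRICE_PER_MILLION_CHARS = {
    "Speech 2.5": 60.0,
    "Speech 2 HD": 100.0,
    "Speech 2 Turbo": 60.0,
}


def estimate_cost_usd(model: str, num_chars: int) -> float:
    """Estimate cost for num_chars characters.

    Assumes simple linear per-character billing, which is an
    illustrative assumption rather than the vendor's billing rules.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return PRICE_PER_MILLION_CHARS[model] * num_chars / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 250,000 characters of text per model.
    for model in PRICE_PER_MILLION_CHARS:
        cost = estimate_cost_usd(model, 250_000)
        print(f"{model}: ${cost:.2f} for 250,000 characters")
```

Under that assumption, two models with the same listed price cost the same for any workload, so the choice between them comes down to the humanness, latency, and language columns.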

Compare against the full Humanness Index™ rankings

About MiniMax

MiniMax is the Shanghai AI lab behind the MiniMax speech line, served through its hosted t2a_v2 API with HD and Turbo tiers. It is widely regarded as the strongest text-to-speech provider for Chinese, and its recent generations brought English accuracy and rhythm up alongside that strength.

Sources: minimax.io

Speech generations

The speech line moved fast through 2025, with the Speech-02 series arriving in April and Speech 2.5 following in August. The current generation supports more than 40 languages and clones a voice from roughly six to ten seconds of reference audio using a learnable speaker encoder that needs no transcript. The clips on this Index were generated with the Speech 2.5 generation, turbo tier.

Sources: minimax.io, platform.minimax.io

MiniMax stats

Languages: 40 [1]
Price / 1M chars: $60 [2]
  1. platform.minimax.io/docs/api-reference/speech-t2a-http (checked 2026-06-10) Vendor lists 40+ languages for the current speech generation.
  2. platform.minimax.io/docs/guides/pricing-paygo (checked 2026-06-10) Speech 2.5 turbo $60 per 1M characters pay-as-you-go; the arena clips are the 2.5 generation, turbo tier (matches the measured realtime latency).

Other providers on the Index

ElevenLabs · Best ranked model #5 · Eleven v3
Cartesia · Best ranked model #3 · Sonic 3.5
xAI · Best ranked model #1 · Grok TTS
Gradium · Best ranked model #19 · Gradium TTS
Canopy Labs · Best ranked model #4 · Orpheus
Inworld · Best ranked model #9 · TTS-2
Smallest.ai · Best ranked model #11 · Lightning v3.1
Neuphonic · Best ranked model #13 · neu_hq

How human does your model really sound?

The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.
