Background
Lightning v3.1 is Smallest.ai's flagship text-to-speech model, a 44.1 kHz engine built for conversational agents, with instant voice cloning from a 5- to 15-second sample. It reads 12 languages with automatic language detection and mid-sentence code-switching, and its launch benchmark reported head-to-head listener wins on EmergentTTS against GPT-4o-mini TTS, ElevenLabs Turbo v2.5 and Multilingual v2, and Cartesia Sonic-3.
Sources: docs.smallest.ai, smallest.ai
At a glance
Lightning v3.1 is the first Smallest.ai entry on the Index. It is served through the unified Waves API with HTTP, SSE, and WebSocket access. Pay-as-you-go pricing sits at roughly $14.50 per 1M characters, among the lowest rates in the field. In our 50-trial streaming benchmark it returned first audio in a median of 420 ms, including network time.
Sources: smallest.ai, docs.smallest.ai
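To make the pricing and latency figures above concrete, here is a minimal Python sketch. It assumes billing is a flat per-character rate applied to the length of the input text, which may not match how the provider actually counts characters, and it assumes the latency figure is a plain median over per-trial time-to-first-audio values. The function names and sample latencies are hypothetical and are not part of the Waves API.

```python
from statistics import median

# Approximate pay-as-you-go rate quoted above (USD per 1M characters).
PRICE_PER_MILLION_CHARS_USD = 14.50


def estimate_cost_usd(text: str, price_per_million: float = PRICE_PER_MILLION_CHARS_USD) -> float:
    """Estimate synthesis cost, assuming a flat per-character rate.

    Assumption: billed characters equal len(text); the provider's real
    counting rules (whitespace, markup, minimums) may differ.
    """
    return len(text) * price_per_million / 1_000_000


def median_first_audio_ms(latencies_ms: list[float]) -> float:
    """Summarize per-trial time-to-first-audio values with the median."""
    return median(latencies_ms)


if __name__ == "__main__":
    prompt = "Thanks for calling. How can I help you today?"
    print(f"Estimated cost: ${estimate_cost_usd(prompt):.6f}")

    # Hypothetical latencies in ms, for illustration only (not measured data).
    sample_trials = [400.0, 420.0, 450.0]
    print(f"Median first audio: {median_first_audio_ms(sample_trials)} ms")
```

The median is used rather than the mean because a single slow network round trip would otherwise skew the summary of a small trial set.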
Position in the rankings
Standings as of Jun 13, 2026, 02:28 UTC
Frequently asked questions
- How is Lightning v3.1 tested on the Humanness Index™?
- Listeners hear Lightning v3.1 against another model in a blind head-to-head round. Both voices read the same customer support prompt from the same cloned source voice, and listeners pick whichever sounds more human. Its Humanness score derives purely from those votes.
- Which languages does Lightning v3.1 support?
- The model card lists 12 languages spanning English, Hindi, Spanish, and nine Indian languages, with automatic language identification and mid-sentence code-switching. The Lightning V3 launch post lists 15, including French, German, Italian, Dutch, Swedish, and Portuguese.
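The blind head-to-head voting described in the first answer can be illustrated with a short Python sketch. The page does not state how votes are turned into a Humanness score, so the win-share formula below is an assumption chosen for simplicity, not the Index's actual method. The function name, the vote format, and the model labels are all hypothetical.

```python
def win_share(votes: list[tuple[str, str]], model: str) -> float:
    """Fraction of rounds a model won, out of the rounds it took part in.

    Each vote is a (winner, loser) pair from one blind head-to-head round.
    Assumption: a plain win share; the real scoring may weight rounds or
    opponents differently.
    """
    rounds = [(winner, loser) for winner, loser in votes if model in (winner, loser)]
    if not rounds:
        raise ValueError(f"no rounds recorded for {model!r}")
    wins = sum(1 for winner, _ in rounds if winner == model)
    return wins / len(rounds)


if __name__ == "__main__":
    # Hypothetical votes for illustration only (not benchmark data).
    sample_votes = [
        ("model-a", "model-b"),
        ("model-b", "model-a"),
        ("model-a", "model-c"),
        ("model-c", "model-b"),
    ]
    print(f"model-a win share: {win_share(sample_votes, 'model-a'):.2f}")
```

A plain win share ignores opponent strength. A rating scheme that accounts for it would be a different technique from the one sketched here.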
How human does your model really sound?
The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.