Flash v2 key stats
- Latency (measured): 226 ms [1]
- Released: December 18, 2024 [4]
- [1] Vapi streaming benchmark (50 trials per model), checked 2026-06-10. Median of 50 sequential live streaming trials, June 2026; includes network RTT from the benchmark machine.
- [2] elevenlabs.io/docs/overview/models (checked 2026-06-10)
- [3] elevenlabs.io/pricing/api (checked 2026-06-10). ElevenAPI pay-as-you-go: Flash/Turbo $0.05 per 1k characters = $50 per 1M. Eleven v3 and Multilingual v2 bill $0.10 per 1k ($100 per 1M), encoded per model.
- [4] elevenlabs.io/blog/meet-flash (checked 2026-06-10)
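The per-1k and per-1M figures in the pricing note above are the same rate at two scales. A minimal sketch of that arithmetic, assuming a flat per-character rate with no tiers, minimums, or plan discounts (the dictionary keys and the `cost_usd` helper are illustrative names, not part of any ElevenLabs API):

```python
# Per-1k-character rates (USD) as quoted in the pricing note.
# The keys are hypothetical labels chosen for this example.
RATE_PER_1K_USD = {
    "flash_turbo": 0.05,
    "v3_multilingual_v2": 0.10,
}


def cost_usd(characters: int, rate_per_1k: float) -> float:
    """Cost of synthesizing `characters` at a flat per-1k-character rate.

    Assumption: billing is strictly linear in character count.
    """
    return characters / 1000 * rate_per_1k


flash_cost = cost_usd(1_000_000, RATE_PER_1K_USD["flash_turbo"])
premium_cost = cost_usd(1_000_000, RATE_PER_1K_USD["v3_multilingual_v2"])

print(f"Flash/Turbo, 1M characters: ${flash_cost:.2f}")
print(f"Eleven v3 / Multilingual v2, 1M characters: ${premium_cost:.2f}")
```

At these rates, the same text costs twice as much on Eleven v3 or Multilingual v2 as on Flash or Turbo.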
Background
Flash v2 is ElevenLabs' ultra-low-latency English model, announced in December 2024 with speech generation of around 75 ms plus network overhead. It trades a little of the Turbo family's expressiveness for speed, and ElevenLabs recommends it for real-time conversational agents that only need English.
Sources: elevenlabs.io
At a glance
Flash v2 is the fastest English path on the ElevenLabs platform. In our 50-trial streaming benchmark, it returned first audio in a median of 226 ms, including network time from the benchmark machine.
Sources: elevenlabs.io
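A "median time to first audio over sequential trials" measurement can be sketched roughly as below. This is not the benchmark's actual harness: `open_stream` is a hypothetical callable standing in for whatever issues the streaming request, and `fake_stream` is a local stand-in so the example runs without network access or credentials.

```python
import statistics
import time
from typing import Callable, Iterable


def time_to_first_chunk_ms(open_stream: Callable[[], Iterable[bytes]]) -> float:
    """Milliseconds from starting the request until the first non-empty chunk."""
    start = time.perf_counter()
    for chunk in open_stream():
        if chunk:
            return (time.perf_counter() - start) * 1000.0
    raise RuntimeError("stream ended before any audio arrived")


def median_first_audio_ms(
    open_stream: Callable[[], Iterable[bytes]], trials: int = 50
) -> float:
    """Run the trials one after another and return the median latency."""
    samples = [time_to_first_chunk_ms(open_stream) for _ in range(trials)]
    return statistics.median(samples)


def fake_stream() -> Iterable[bytes]:
    """Hypothetical stand-in for a streaming TTS call: short delay, then audio."""
    time.sleep(0.002)
    yield b"\x00" * 320
    yield b"\x00" * 320


result_ms = median_first_audio_ms(fake_stream, trials=5)
print(f"median time to first audio: {result_ms:.1f} ms")
```

Because the timer starts before the request is issued, a figure measured this way includes network round-trip time from the measuring machine, which is why it sits above a vendor's generation-only number.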
Position in the rankings
Standings as of Jun 13, 2026, 00:15 UTC
Frequently asked questions
- How is Flash v2 tested on the Humanness Index™?
- Listeners hear Flash v2 against another model in a blind head-to-head round, with both voices reading the same customer support prompt from the same cloned source voice, and they pick whichever sounds more human. Its Humanness score derives purely from those votes.
- What latency does Flash v2 have?
- ElevenLabs publishes a generation time of roughly 75 ms. In our own 50-trial streaming benchmark, which includes network time from the benchmark machine, Flash v2 returned first audio in a median of 226 ms.
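The answers above say the Humanness score comes purely from head-to-head votes but do not say how those votes are aggregated. As one simple possibility, and only an assumption here, a per-model win rate could be tallied as follows (the model names and the `win_rates` helper are hypothetical; the real index may use a different method, such as a rating system):

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def win_rates(votes: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of rounds each model won.

    `votes` is an iterable of (winner, loser) pairs, one per blind round.
    Assumption: every round counts equally, regardless of opponent.
    """
    wins: Counter = Counter()
    rounds: Counter = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        rounds[winner] += 1
        rounds[loser] += 1
    return {model: wins[model] / rounds[model] for model in rounds}


# Made-up votes for illustration only; not benchmark data.
sample_votes = [
    ("model_a", "model_b"),
    ("model_b", "model_a"),
    ("model_a", "model_c"),
    ("model_c", "model_b"),
]
rates = win_rates(sample_votes)
for model, rate in sorted(rates.items()):
    print(f"{model}: {rate:.2f}")
```

A plain win rate ignores opponent strength, so it is best read as an illustration of vote-only scoring and not as the index's formula.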
How human does your model really sound?
The benchmark is open source. Suggest a model, read the methodology, or ask us to put your voice in the arena.