Kyutai released their Streaming Text to Speech model, ~2B param model, ultra low latency (220ms), CC-BY-4.0 license 🔥 Trained on 2.5 Million Hours of audio, it can serve up to 32 users w/ less than 350ms latency on a SINGLE L40 🤯 Incredible release by kyutai folks, go check...
93,512 views • 11 months ago • via X (Twitter)
6 comments

Vaibhav (VB) Srivastav • 11 months ago
Check out their models here:

Aakash • 11 months ago
"Trained on 2.5 Million Hours of audio, it can serve up to 32 users w/ less than 350ms latency on a SINGLE L40" Can we get more details on this benchmark?

KD • 11 months ago
These are some of the same guys who run a really amazing YT channel about CS btw:

ZAZO • 11 months ago
that’s the best thing that happened in 2025 🔥🔥🔥🔥🔥🔥🔥🔥🔥

Bui Dinh Ngoc • 11 months ago
This is game-changing for accessibility tools. I've been waiting for low-latency TTS that doesn't break the bank or require proprietary licenses.

Carlos DP • 11 months ago
SUCH a solid demo lol, S tier
