Kyutai released their Streaming Text to Speech model, ~2B param model, ultra low latency (220ms), CC-BY-4.0 license 🔥 Trained on 2.5 Million Hours of audio, it can serve up to 32 users w/ less than 350ms latency on a SINGLE L40 🤯 Incredible release by kyutai folks, go check...

93,512 views • 11 months ago • via X (Twitter)

6 comments

Vaibhav (VB) Srivastav • 11 months ago

Check out their models here:

Aakash • 11 months ago

"Trained on 2.5 Million Hours of audio, it can serve up to 32 users w/ less than 350ms latency on a SINGLE L40" can we get more of this benchmark

KD • 11 months ago

These are some of the same guys who run a really amazing YT channel about CS btw:

ZAZO • 11 months ago

that’s the best thing that happened in 2025 🔥🔥🔥🔥🔥🔥🔥🔥🔥

Bui Dinh Ngoc • 11 months ago

This is game-changing for accessibility tools. I've been waiting for low-latency TTS that doesn't break the bank or require proprietary licenses.

Carlos DP • 11 months ago

SUCH a solid demo lol, S tier
