Kyutai Speech-To-Text is now open-source! It’s streaming, supports batched inference, and runs blazingly fast: perfect for interactive applications. Check out the details here:

Today we are releasing two models. The first one is a 2.6B English-only model that beats Whisper Large v3 on benchmarks even though it’s a streaming model that doesn’t process all the audio at once. It can process 400 sequences in parallel on a single H100.

The other model is a lightweight English/French 1B model optimized for real-time voice chat apps. It comes with a semantic voice activity detector that predicts if you’re done talking or just pausing mid-sentence. The open-source release of Kyutai Text-To-Speech will follow soon!

Magnificent!

This is great!! We'll cover it on @thursdai_pod in an hour

That is really good. Well done :)

Interesting... Great conversation with this AI. Wondering about potential opportunities to embed this functionality into apps...

It needs mooore languages

@dankvr finally

👏

