Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥
> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Zero-shot voice cloning
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained...

44,654 views • 1 year ago • via X (Twitter)

11 Comments

Vaibhav (VB) Srivastav • 1 year ago

Check out the model weights and inference code base here:

Vaibhav (VB) Srivastav • 1 year ago

llama.cpp compatible GGUFs:

Vaibhav (VB) Srivastav • 1 year ago

OuteTTS GitHub:

Haorui He • 1 year ago

Big Congrats!!! Another SOTA TTS model trained on Emilia after F5-TTS & MaskGCT! Try out:

Tommy Falkowski • 1 year ago

Just tested it out and the quality is very good. More importantly, the fact that you can generate speaker profiles is awesome! Will test it out some more and add it to my growing list of supported TTS engines in my app 🤣

Umesh • 1 year ago

This is improving so fast that I don't want to speak myself anymore. Just use this and get done 🤖

Fronesis • 1 year ago

Thank you for your work and for sharing insights! 🙌 Advancements like OuteTTS v0.2 showcase the rapid evolution of AI and its potential to empower global communities. 🚀 The future of #AI is bright, and collaborative innovation is key to unlocking its full potential!

Digital Doctor • 1 year ago

Are you saying you can voice CLONE on a R-Pi? Is that what you're saying????

斎藤ただし, Tadashi Saito • 1 year ago

The font used for the Japanese characters is wrong; it looks like it is meant for (maybe) Chinese. I hope you'll give each language proper attention and respect when you work on multilingual/multicultural things (like your TTS engine itself does. Brilliant quality ✨️)

Ahmed Mansour • 1 year ago

I tried to run it on HF. Average inference time for 200 chars is >1 hour running on CPU. Why is this model so heavy?
