Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥

> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Zero-shot voice cloning
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained...

44,654 views • 1 year ago • via X (Twitter)

11 Comments

Vaibhav (VB) Srivastav • 1 year ago

Check out the model weights and inference code base here:

Vaibhav (VB) Srivastav • 1 year ago

llama.cpp compatible GGUFs:

Vaibhav (VB) Srivastav • 1 year ago

OuteTTS GitHub:

Haorui He • 1 year ago

Big Congrats!!! Another SOTA TTS model trained on Emilia after F5-TTS & MaskGCT! Try out:

Tommy Falkowski • 1 year ago

Just tested it out and the quality is very good. More importantly, the fact that you can generate speaker profiles is awesome! Will test it out some more and add it to my growing list of supported tts engines in my app 🤣

Umesh • 1 year ago

This is improving so fast that I don't want to speak myself anymore. Just use this and get done 🤖

Fronesis • 1 year ago

Thank you for your work and for sharing insights! 🙌 Advancements like OuteTTS v0.2 showcase the rapid evolution of AI and its potential to empower global communities. 🚀 The future of #AI is bright, and collaborative innovation is key to unlocking its full potential!

Digital Doctor • 1 year ago

Are you saying you can voice CLONE on a R-Pi? Is that what you're saying????

斎藤ただし, Tadashi Saito • 1 year ago

The font of Japanese characters is wrong, it's for (maybe) Chinese. I hope you'll pay attention and respect to each of them when you are working for multilingual/multicultural things. (like your TTS engine itself does. Brilliant quality✨️)

Ahmed Mansour • 1 year ago

I tried to run it on HF. The average inference time for 200 characters is over 1 hour on CPU. Why is this model so heavy?
