Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥
> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Zero-shot voice cloning
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained...

44,566 views • 1 year ago • via X (Twitter)

Comments: 11

Vaibhav (VB) Srivastav • 1 year ago

Check out the model weights and inference code base here:

Vaibhav (VB) Srivastav • 1 year ago

llama.cpp compatible GGUFs:

Vaibhav (VB) Srivastav • 1 year ago

OuteTTS GitHub:

Haorui He • 1 year ago

Big Congrats!!! Another SOTA TTS model trained on Emilia after F5-TTS & MaskGCT! Try out:

Tommy Falkowski • 1 year ago

Just tested it out and the quality is very good. More importantly, the fact that you can generate speaker profiles is awesome! Will test it out some more and add it to my growing list of supported TTS engines in my app 🤣

SkyTab • 1 year ago

Switch to SkyTab and get $5,000! A modern and sleek POS system with commercial-grade durability. 💪 ✅ $0 upfront costs ✅ Best in-class POS ✅ Local service & 24/7 support ✅ And much more! Make the switch today:

Umesh • 1 year ago

This is improving so fast that I don't want to speak myself anymore. Just use this and get done 🤖

Fronesis • 1 year ago

Thank you for your work and for sharing insights! 🙌 Advancements like OuteTTS v0.2 showcase the rapid evolution of AI and its potential to empower global communities. 🚀 The future of #AI is bright, and collaborative innovation is key to unlocking its full potential!

Digital Doctor • 1 year ago

Are you saying you can voice CLONE on a R-Pi? Is that what you're saying????

斎藤ただし, Tadashi Saito • 1 year ago

The font of Japanese characters is wrong, it's for (maybe) Chinese. I hope you'll pay attention and respect to each of them when you are working for multilingual/multicultural things. (like your TTS engine itself does. Brilliant quality✨️)

Ahmed Mansour • 1 year ago

I tried to run it on HF. Average inference time for 200 chars is >1 hour running on CPU. Why is this model so heavy?
