We just solved text-to-speech AI. This model can simulate perfect emotion, scream, and show genuine alarm. It clearly beats ElevenLabs and Sesame, it's only 1.6B params, it streams in real time on 1 GPU, and it was made by a 1.5-person team in Korea!! It's called Dia, by Nari Labs.

710,352 views • 1 year ago • via X (Twitter)

11 Comments

Deedy • 1 year ago

Source:

Deedy • 1 year ago

The future is about to look really weird. Audio may have just crossed the uncanny valley (like parts of text and image have) into most-humans-won't-know-this-is-AI territory.

MightyBot • 1 year ago

🧠 Unified Search. Smarter Meetings. Effortless CRM. MightyBot is your AI agent platform for seamless workflows—record meetings, automate CRM updates, and find answers across apps in seconds. 🌟 Focus on what matters. We'll handle the grind.

Yuchen Jin • 1 year ago

what is the 0.5 person in the 1.5 person team? 😂

Deedy • 1 year ago

Part time research engineer!

Mudit Juneja • 1 year ago

Who are we here? Are you tied to this project?

Deedy • 1 year ago

We = humanity

Cr33d • 1 year ago

1.5 people?! Did the 0.5 person just handle the screaming?

Rithik Chopra • 1 year ago

Damn that’s crazy!!!

Albert Sebastian • 1 year ago

What's your take on Hume AI?

Cr33d • 1 year ago

Perfect emotion? Finally, my toaster can apologize for burning my toast! 😂
