
We just solved text-to-speech AI. This model can simulate perfect emotion, screaming and show genuine alarm.

- clearly beats 11 labs and Sesame
- it’s only 1.6B params
- streams realtime on 1 GPU
- made by a 1.5 person team in Korea!!

It's called Dia by Nari Labs.

Deedy

245,864 subscribers

710,352 views • 1 year ago • via X (Twitter)



11 Comments

Deedy • 1 year ago

Source:

Deedy • 1 year ago

The future is about to look really weird. Audio may have just crossed the uncanny valley (like parts of text and image have) into most-humans-won't-know-this-is-AI territory

MightyBot • 1 year ago

🧠 Unified Search. Smarter Meetings. Effortless CRM. MightyBot is your AI agent platform for seamless workflows—record meetings, automate CRM updates, and find answers across apps in seconds. 🌟 Focus on what matters. We'll handle the grind.

Yuchen Jin • 1 year ago

what is the 0.5 person in the 1.5 person team? 😂

Deedy • 1 year ago

Part time research engineer!

Mudit Juneja • 1 year ago

Who are we here? Are you tied to this project?

Deedy • 1 year ago

We = humanity

Cr33d • 1 year ago

1.5 people?! Did the 0.5 person just handle the screaming?

Rithik Chopra • 1 year ago

Damn that’s crazy!!!

Albert Sebastian • 1 year ago

whats your take on hume ai?

Cr33d • 1 year ago

Perfect emotion? Finally, my toaster can apologize for burning my toast! 😂
