Microsoft just dropped VASA-1. This AI can make a single image sing and talk expressively from an audio reference. Similar to EMO from Alibaba. 10 wild examples: 1. Mona Lisa rapping Paparazzi

Min Choi

376,046 subscribers

7,298,891 views • 2 years ago • via X (Twitter)

Science & Technology • Arts • Education

13 Comments

Min Choi • 2 years ago

2. Realism and liveliness - example 1

Min Choi • 2 years ago

3. Realism and liveliness - example 2

Min Choi • 2 years ago

4. Out-of-distribution generalization - singing audios

Min Choi • 2 years ago

5. Controllability of generation 1: example of eye gaze direction, head distance, and emotion offsets

Min Choi • 2 years ago

6. Controllability of generation 2: example of different emotion offsets

Min Choi • 2 years ago

7. Power of disentanglement: example of the same motion sequence with different photos

Min Choi • 2 years ago

8. Power of disentanglement: pose and expression editing

Min Choi • 2 years ago

9. Out-of-distribution generalization - singing audios

Min Choi • 2 years ago

10. Realism and liveliness - example 2

Min Choi • 2 years ago

READ MORE: Official Microsoft Research blog at

Min Choi • 2 years ago

If you enjoyed this thread, Follow me @minchoi and please Bookmark, Like, Comment & Repost the first Post below to share with your friends:

Min Choi • 2 years ago

Also check out wild new AI Music Videos 👇

Min Choi • 2 years ago

Also check out my series "AI will disrupt Hollywood (Part 36)" 👇

Related Videos

ByteDance (TikTok) announced Loopy recently. AI that brings any single image to life, making it sing and talk with facial expressions from an audio reference. 🤯 Similar to EMO (Alibaba) and VASA-1 (Microsoft) 10 wild examples: 1. Leonardo DiCaprio singing a Chinese song

Min Choi

298,932 views • 1 year ago

This is mind blowing. This AI can make a single image sing, talk, and rap expressively from any audio file! 🤯 Introducing EMO: Emote Portrait Alive by Alibaba. 10 wild examples: 🧵👇 1. AI Lady from Sora singing Dua Lipa

Min Choi

1,494,087 views • 2 years ago

Hedra just dropped Character-1. And people can't stop making any person in an image sing and talk from audio with this AI. 10 wild examples: 1. Fake female Elon rapping Paparazzi

Min Choi

355,747 views • 2 years ago

Apparate Labs launched PROTEUS, a new real-time AI video generation model. It creates realistic avatars and lip-syncs from a single reference image, similar to VASA-1, but it's completely real-time.

Rowan Cheung

13,253 views • 2 years ago

Google DeepMind just dropped Genie 2. AI can now create diverse, interactive 3D worlds from a single image or text. Gaming will never be the same. 10 wild examples: 1. Long video generation on the fly

Min Choi

373,295 views • 1 year ago

Udio just dropped Audio Prompting, and it's mind blowing. People can "Upload" their own music/sound and it will extend it. 10 wild examples: 1. udio

Min Choi

244,291 views • 2 years ago

Meta just announced MoCha. This AI can create full movie-quality talking & singing characters from just speech & text. 10 wild examples: 1. Talking Characters

Min Choi

420,113 views • 1 year ago

This is wild. OpenAI just dropped ChatGPT-4o and it will completely change the AI assistant game. 10 wild examples: 1. Visual assistant in real-time

Min Choi

12,580,102 views • 2 years ago

Google DeepMind team dropped more wild AI Agent videos. 3 examples: 1. Identify famous face and facts from drawings

Min Choi

349,742 views • 2 years ago

This is peak... Google just unveiled Genie 3. This AI generates photorealistic & 3D worlds from a text prompt and image... that you can explore in real-time. Clearing a path towards AGI. 10 wild examples + how to try below: 1. Control a shiny marble

Linus ✦ Ekenstam

44,603 views • 5 months ago

mona lisa from this angle was a 1+9/10🔥

mimi’s calico⁷ ͤ ͨ ͪ ͦ

49,317 views • 1 year ago

This is wild. Replit Agent just dropped, and it's about to completely change the app development game... from idea to deploy, even from your mobile. 🤯 10 wild examples:

Min Choi

369,659 views • 1 year ago

Microsoft just dropped VibeVoice (open-source). This AI turns text into a 90-min, up to 4-voice podcast. With natural pauses, emotion, even singing. 6 wild examples + code: 1. Spontaneous singing

Min Choi

94,614 views • 10 months ago

This is wild. Luma AI just dropped Dream Machine, which generates AI video from text and image. Unlike Sora, it's open to the public today. The quality is insane. 1. Kaku Drop 架空飴

Min Choi

757,400 views • 2 years ago

OK, this is insane.. Alibaba just dropped 4 new AI models at Apsara 2025, and they’re wild: → a 1 trillion parameter LLM → a vision model that codes from images → an omni-model for text/audio/video → and a new Wan 2.5 preview for video + audio gen more details below:👇

Hamza Khalid

26,707 views • 9 months ago

Microsoft just dropped Muse. This AI can generate minutes of smooth gameplay from just 1 second of footage & controls. And it's open source! The future of gaming is about to change forever.

Min Choi

102,961 views • 1 year ago

Ultra realistic AI-video from a photo. This is VASA-1 from Microsoft Research. The improvements in quality we’re getting between each new release are incredible. Links below

Linus ✦ Ekenstam

483,041 views • 2 years ago

This is wild. China's Alibaba just dropped Live Avatar. This AI turns any voice into a realtime, talking avatar with infinite length at 20 FPS. 10 wild demos:👇 1. Ilya interview that never happened

Min Choi

220,656 views • 6 months ago

This is totally wild. Google Veo 3 dropped just 90 hours ago, and people literally can’t stop creating with it. No one can believe this is 100% AI. 10 new trending examples: 1. A car show that never happened, all AI-generated.

ZOYA ✪

4,844,955 views • 1 year ago