“You can build a real relationship with someone using just your voice.” Understanding the power of audio, JC built Playfriends into a 1-million-user voice platform where creators can monetise through voice alone. Now, with Beam, creators can receive virtual gifts without even going live. Tune in to the discussion...

10,833 views • 5 months ago • via X (Twitter)

Related videos

Voice used to be AI’s forgotten modality - now it's having its big moment: rapid innovation, big funding rounds, major agentic applications.

My conversation with Neil Zeghidour, top AI researcher in the field (Google DeepMind, Meta, Kyutai) and now CEO of Gradium.

This is a reference episode on all things voice AI 🔥

00:00 Intro
01:21 Voice AI’s big moment, and why we’re still early
03:34 Why voice lagged behind text/image/video
06:06 The convergence era: transformers for every modality
07:40 Beyond Her: always-on assistants, wake words, voice-first devices
11:01 Voice vs text: where voice fits (even for coding)
12:56 Neil’s origin story: from finance to machine learning, with help from Yann LeCun and Soumith Chintala
18:35 Neural codecs (SoundStream): compression as the unlock
22:30 Kyutai: open research, small elite teams, moving fast
31:32 Why big labs haven’t “won” voice AI
34:01 On-device voice: where it works, why compact models matter
46:37 The last mile: real-world robustness, pronunciation, uptime
41:35 Benchmarking voice: why metrics fail, how they actually test
47:03 Cascades vs speech-to-speech: trade-offs + what’s next
54:05 Hardest frontier: noisy rooms, factories, multi-speaker chaos
1:00:50 New languages + dialects: what transfers, what doesn’t
1:02:54 Hardware & compute: why voice isn’t a 10,000-GPU game
1:07:27 What data do you need to train voice models
1:09:02 Deepfakes + privacy: why watermarking isn’t a solution
1:12:30 Voice + vision: multimodality, screen awareness, video+audio
1:14:43 Voice cloning vs voice design: where the market goes
1:16:32 Paris/Europe AI: talent density, underdog energy, what’s next

Matt Turck

22,895 views • 4 months ago