introducing lipsync-1.9-beta, setting a new standard for lipsync quality. it’s zero-shot: no training data needed. generate and edit natural speech seamlessly in any live action, animation, or AI-generated video. this is the biggest update we’ve ever released, and it’s the most natural lipsyncing model in the world 🔥 available now, try...

129,075 views • 1 year ago • via X (Twitter)

11 Comments

sync. • 1 year ago

how do you design an intuitive product to expose a capability that’s never existed before? you iterate. try out the new platform here 👇 tell us what you like and don’t like. lots of exciting updates incoming :)

sync. • 1 year ago

we’ve slowly rolled out early versions of this model to some of you. even with a limited release in beta, the response has been overwhelming :) across a small segment of users, we’ve already seen this model become the most popular choice, generating hundreds of hours in just a few days. here’s a side-by-side comparison w/ 1.8.0 and 1.7.1:

sync. • 1 year ago

Here are some examples showing how we’ve benchmarked against the latest in open source (lipsync-1.9-beta vs latentsync vs musetalk)

sync. • 1 year ago

see some more examples of what lipsync-1.9 empowers you to do:

sync. • 1 year ago

it works well across live action, animated, and even AI-generated video (this video is completely AI-generated)

sync. • 1 year ago

video dubbing: you can now follow the @lexfridman podcast with @ZelenskyyUa in fluent english as if it were his native tongue, with no distracting mismatch between the audio and video

sync. • 1 year ago

or you can replace dialogue in any scene (original vs resynced with two different dialogues)

sync. • 1 year ago

this model is special. our old pipelines accumulated errors as the video passed from one stage into another. lipsync-1.9 is an end-to-end monolith that operates in a single shot, which helps it make very few mistakes across a wide range of videos. it marks a profound shift in how we design our models. trained across millions of speakers + tens of thousands of hours of video, this new approach will pave the way to a future where any content can be made in a single take.

sync. • 1 year ago

ps. we moved a large part of our company to be irl over the last few weeks to bring 1.9-beta into production, this time in India :) here’s a little bts into how we did it:

sync. • 1 year ago

check it out on yt:

AssemblyAI • 1 year ago

Announcing: Our most advanced speech-to-text model goes beyond accuracy to capture the real-world complexity of human conversation and deliver reliable, source-of-truth audio data. Explore Universal-2 updates 👇
