
introducing lipsync-1.9-beta, setting a new standard for lipsync quality. it’s zero-shot: no training data needed. generate and edit natural speech seamlessly in any live-action, animated, or AI-generated video. this is the biggest update we’ve ever released, and it’s the most natural lipsyncing model in the world 🔥 available now, try...

129,075 views • 1 year ago • via X (Twitter)

11 comments

sync. • 1 year ago

how do you design an intuitive product to expose a capability that’s never existed before? you iterate. try out the new platform here 👇 tell us what you like and don’t like. lots of exciting updates incoming :)

sync. • 1 year ago

we’ve slowly rolled out early versions of this model to some of you. even with a limited release in beta, the response has been overwhelming :) across a small segment of users, we’ve already seen this model become the most popular choice, generating hundreds of hours in just a few days. here’s a side-by-side comparison w/ 1.8.0 and 1.7.1:

sync. • 1 year ago

here are some examples showing how we’ve benchmarked against the latest in open source (lipsync-1.9-beta vs LatentSync vs MuseTalk):

sync. • 1 year ago

see some more examples of what lipsync-1.9 empowers you to do:

sync. • 1 year ago

it works well across live-action, animated, and even AI-generated video (this video is completely AI-generated)

sync. • 1 year ago

video dubbing: you can now follow the @lexfridman podcast with @ZelenskyyUa in fluent English, as if it were his native tongue, with no distracting mismatch between the audio and video

sync. • 1 year ago

or you can replace dialogue in any scene (original vs resynced with two different dialogues)

sync. • 1 year ago

this model is special. our old pipelines accumulated errors as the video passed from one stage into another. lipsync-1.9 is an end-to-end monolith that operates in a single shot. this helps it make very few mistakes across a wide range of videos. it marks a profound shift in how we design our models. trained across millions of speakers + tens of thousands of hours of video, this new approach will pave the way to a future where any content can be made in a single take.
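
To illustrate why errors can accumulate across a staged pipeline, here is a minimal sketch. It assumes each stage succeeds independently with the same probability; the function name `pipeline_success` and the 0.95 figure are hypothetical and are not taken from sync.'s actual pipelines or measurements.

```python
def pipeline_success(per_stage_success: float, stages: int) -> float:
    """Probability that every stage succeeds.

    Assumption (illustrative only): stages fail independently and share
    the same success probability, so the probabilities multiply.
    """
    if not 0.0 <= per_stage_success <= 1.0:
        raise ValueError("per_stage_success must be between 0 and 1")
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return per_stage_success ** stages


if __name__ == "__main__":
    # Hypothetical numbers: a 95%-reliable stage, chained 1, 3, and 5 times.
    for stages in (1, 3, 5):
        print(stages, round(pipeline_success(0.95, stages), 3))
```

Under these assumptions, three chained stages at 0.95 each succeed together only about 0.857 of the time. That is the intuition behind preferring a single end-to-end step, though real stages are rarely independent.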

sync. • 1 year ago

ps. we moved a large part of our company to be irl over the last few weeks to bring 1.9-beta into production, this time in India :) here’s a little bts of how we did it:

sync. • 1 year ago

check it out on yt:

AssemblyAI • 1 year ago

Announcing: Our most advanced speech-to-text model goes beyond accuracy to capture the real-world complexity of human conversation and deliver reliable, source-of-truth audio data. Explore Universal-2 updates 👇
