introducing lipsync-1.9-beta, setting a new standard for lipsync quality. it’s zero-shot: no training data needed. generate and edit natural speech seamlessly in any live-action, animated, or AI-generated video. this is the biggest update we’ve ever released, and it’s the most natural lipsyncing model in the world 🔥 available now, try...

sync.

10,875 subscribers

129,075 views • 1 year ago • via X (Twitter)

Science & Technology, Education

11 Comments

sync. • 1 year ago

how do you design an intuitive product to expose a capability that’s never existed before? you iterate. try out the new platform here 👇 tell us what you like and what you don’t. lots of exciting updates incoming :)

sync. • 1 year ago

we’ve slowly rolled out early versions of this model to some of you. even with a limited release in beta, the response has been overwhelming :) across a small segment of users, we’ve already seen this model become the most popular choice, generating hundreds of hours in just a few days. here’s a side-by-side comparison w/ 1.8.0 and 1.7.1:

sync. • 1 year ago

Here are some examples showing how we’ve benchmarked against the latest in open source (lipsync-1.9-beta vs latentsync vs musetalk)

sync. • 1 year ago

see some more examples of what lipsync-1.9 empowers you to do:

sync. • 1 year ago

it works well across live action, animated, and even AI generated video (this video is completely ai generated)

sync. • 1 year ago

video dubbing: you can now follow the @lexfridman podcast with @ZelenskyyUa in fluent english as if it were his native tongue, with no distracting mismatch between the audio and video

sync. • 1 year ago

or you can replace dialogue in any scene (original vs resynced with two different dialogues)

sync. • 1 year ago

this model is special. our old pipelines accumulated errors as the video passed from one stage into another. lipsync-1.9 is an end-to-end monolith that operates in a single shot, which helps it make very few mistakes across a wide range of videos. it marks a profound shift in how we design our models. trained across millions of speakers + tens of thousands of hours of video, this new approach will pave the way to a future where any content can be made in a single take.

sync. • 1 year ago

ps. we moved a large part of our company to be irl over the last few weeks to bring 1.9-beta into production, this time in India :) here’s a little bts into how we did it:

sync. • 1 year ago

check it out on yt:

AssemblyAI • 1 year ago

Announcing: Our most advanced speech-to-text model goes beyond accuracy to capture the real-world complexity of human conversation and deliver reliable, source-of-truth audio data. Explore Universal-2 updates 👇

Related Videos

today we're introducing lipsync-2, the world's first zero-shot lipsyncing model that preserves a speaker's unique style w/o additional training or fine-tuning lipsync-2 is a leap forward in realism, expressiveness, control, quality, and speed across live-action, animated, and AI-generated video lipsync-2 is rolling out in GA today 🧵

sync. labs

89,604 views • 1 year ago

we just shipped our most capable video-to-video lipsyncing model yet introducing lipsync-1.8.0 🪄 edit what anyone says in any video, no training required available now via API + v2 playground. check it out 👇

Prady

39,262 views • 1 year ago

we’ve built the most natural lipsync model in the world. again.

sync. labs

323,963 views • 1 year ago

the most natural lipsyncing tool just got better introducing lipsync-2-pro, a state-of-the-art video model to edit what anyone says in any video edit high resolution video while preserving every detail - from freckles to full beards - on any character

sync. labs

181,232 views • 9 months ago

We just dropped a new SoTA lipsync model on fal: Hummingbird-0 Available now as a research preview, it's the most accurate zero-shot lipsync model we’ve tested, open or closed source.

Tavus

460,422 views • 1 year ago

in 7 days lipsync-2 is 80% of our usage, the fastest adoption we’ve ever seen. across live-action, AI generated, even high-end animations, it’s seamless. and it’s only improving. incredibly excited to scale this up and launch our CPP

Prady

11,962 views • 1 year ago

Introducing Gemini 3.5 Flash Live Translate, our real time speech to speech translation model which supports more than 70 languages (both in and out), and is so natural. It is available in the Gemini API, AI Studio, & Google Translate right now + coming soon to Google Meet!!

Logan Kilpatrick

450,555 views • 8 days ago

Introducing Gemini 3.1 Flash TTS 🗣️, our latest text to speech model with scene direction, speaker level specificity, audio tags, more natural + expressive voices, and support for 70 different languages. Available via our new audio playground in AI Studio and in the Gemini API!

Logan Kilpatrick

800,217 views • 2 months ago

Introducing Veo 3.1 in Hedra. The new standard for AI video is here. Generate stunningly photorealistic video for any scene you can imagine, powered by Google's most advanced model. Experience it now in Hedra.

Hedra

1,809,782 views • 8 months ago

Introducing TTS WebGPU: The first ever text-to-speech web app built with WebGPU acceleration! 🔥 High-quality and natural speech generation that runs 100% locally in your browser, powered by OuteTTS and Transformers.js.🤗 Try it out yourself! Demo + source code below 👇

Xenova

19,666 views • 1 year ago

Introducing AI Avatar Videos 🚀 Hyperrealistic and expressive AI avatars that ACTUALLY WORK.
Marketers trying to adopt AI for video are stuck in the loop:
1. Sees cool demo.
2. Tries the platform.
3. Output is far from useful.
4. Back to 1.
NOT ANYMORE! Perfect lipsync. Expressive gestures. Hundreds of AI avatars you can choose from. Want to use your content to create endless variations? gotchu
CreatorKit is now powered by our zero shot AI lipsync. WTF is zero shot AI, and why should you care?
It means no more training videos. Just lipsync.
It means no more minimum length. 5 sec vids work.
It means no more hidden costs. Both time and $.
If you've tried any AI video platform, you’ve been through all the above. Whether you’re cloning yourself for social or turning UGC footage into fresh ads, IT WORKS. Ready to see for yourself? Try CreatorKit and let me know how it goes!!

Kevin Natanzon

72,587 views • 1 year ago

Dreamina Seedance 2.0 is LIVE on CapCut app, desktop & web, starting gradually in Indonesia, Philippines, Thailand, Vietnam, Malaysia, Brazil and Mexico with expansion over time. Generate and edit with industry-leading quality in one seamless workflow.
Built to unlock new possibilities in visual storytelling with CapCut:
- SOTA long-form coherence, up to 15s videos with multi-shot storytelling and exceptional text prompt adherence
- Built-in dialogue, lipsync, and immersive spatial sound
- Multimodal reference for greater creative control and precision
Find Dreamina Seedance 2.0 in the following CapCut features:
- Quick try: AI Lab & AI Generator (app, v17.1.0)
- Generate + edit workflow: Media → AI Video (app & desktop)
- AI-native workflow with omni reference: Video Studio (our latest canvas-based ai production workspace built for everyone from beginners to pro, access via CapCut web)

CapCut

7,170,307 views • 2 months ago

Speech translation has been one of the longest-running ML efforts at Google, and we’ve come a long way. Gemini 3.5 Live Translate is our latest speech-to-speech model, supporting 70+ languages. It enables more natural conversations across languages in everyday products and apps. Here’s an example of how partners at Grab are helping connect travelers with drivers. 🚗 Rolling out in Google Translate and via the Live API in Google AI Studio.

Jeff Dean

34,407 views • 8 days ago

ANNOUNCING: THE NEW ARKHAM API Today we’re unveiling our full-fledged Intel API based on enhancements we’ve made throughout our Pilot Program. The new Arkham API is the most advanced in the world for on-chain data and analytics.

Arkham

274,055 views • 4 months ago

Video Generation is LIVE on imgnAI Discord and Web in Beta! You can now bring your AI-generated images to life with short videos - right inside our Discord bot and web app. This is just the beginning of a new era in creative motion. 👇 🧵

imgnAI by IMGN Labs

19,496 views • 1 year ago

Introducing Arcana: AI Voices with Vibes 🔮 We just launched the most realistic spoken language (TTS) model like ever! At Rime, we're dedicated to capturing the authentic nuances of real human speech, accents, laughter, sighs, and everything in between. Arcana also lets developers generate infinite voices just by providing a description or a fictional name. Live on our API and dashboard from day one. No waitlists, no barriers. Ready for building! Perfect for: ✨Relatable business voice agents ✨Fully immersive chatbots ✨Creative and dynamic storytelling ✨Natural multilingual conversations Try chatting with Arcana on our homepage now. Links below ⬇️

Rime

48,135 views • 1 year ago

good music makes all the difference and Mureka just dropped their new V7 model for AI-generated songs way more natural and on-point, and TTS to convert any text into rich, emotive speech! examples + everything you need to know in this thread 👇

TechHalla

180,180 views • 11 months ago

A humanoid robot policy trained solely on synthetic data generated by a world model. Research Scientist Joel Jang presents NVIDIA's DreamGen pipeline:
⦿ Post-train the world model Cosmos-Predict2 with a small set of real teleoperation demos.
⦿ Prompt the world model to generate synthetic video data with verbs and scenarios not used in the world model’s post-training.
⦿ Auto-label synthetic video data with action sequences.
⦿ Train robot policies using only synthetic data.
That's it. Deploy zero-shot to a real humanoid robot.

The Humanoid Hub

20,968 views • 11 months ago

$AMZN CEO: “There is a power shortage in the US and the world.” For new data centers, the most optimistic grid connection wait time is 3 years as power transformer lead times stretch to 12-24 months. So, most of the developers and hyperscalers are resorting to natural gas as it’s the fastest option. But even then, permitting is still an issue and risks delays and cancellations as is the case for $NBIS New Jersey data center. Power is now a bigger bottleneck for scaling AI than it’s ever been.

Oguz Erkan

293,583 views • 1 month ago