🚀 Introducing InterDyn — our newly accepted CVPR work that explores controllable synthesis of interactive dynamics! Building upon powerful video diffusion models, InterDyn infers future motion and interactions directly from an input image and a dynamic control signal (e.g., a moving hand mask). Check out how we push the...

44,898 views • 1 year ago • via X (Twitter)

10 Comments

Haven (Haiwen) Feng • 1 year ago

Dynamic Control & Beyond: Unlike prior methods that rely on explicit simulation or only static state transitions, InterDyn builds a dynamic control branch on top of Stable Video Diffusion. We then fine-tune it to generate complex interactions (e.g., hand-object manipulations) and realistic multi-object collisions without heavy simulation computations. #StableDiffusion #StabilityAI 🧵2/6

Haven (Haiwen) Feng • 1 year ago

Intuitive Physics: At its core, InterDyn showcases the diffusion model’s “knowledge” of real-world physics and causal effects. By simply providing a moving object mask, the system implicitly models collisions, force propagation, and object dynamics—no 3D reconstruction or separate physics engine needed. 🧵3/6

Haven (Haiwen) Feng • 1 year ago

Superior Performance: We evaluate InterDyn on both synthetic (CLEVRER) and real-world (Something-Something-v2) datasets, achieving up to 37.5% improvement on LPIPS and 77% on FVD over prior work. Whether it’s multi-object collisions or hand-object manipulations, InterDyn produces diverse and physically plausible videos. 🧵4/6

Haven (Haiwen) Feng • 1 year ago

Toward Interactive Video Generation: This new perspective merges intuitive physics with large-scale generative models, opening the door to controllable dynamics synthesis in complex scenes. We believe InterDyn lays the groundwork for future explorations in interactive video generation. Stay tuned for more! 🧵5/6

Haven (Haiwen) Feng • 1 year ago

This work was co-led by me and our talented Master's intern @rick_akker25502 (he's applying for PhD positions now, hire him!), together with our amazing advisors @Michael_J_Black, @dimtzionas, and @vfabrevaya. More details & demos coming soon! See you in Nashville! #CVPR2025 #AI #ResearchPapers 🧵6/6

Adam • 1 year ago

Great work! I had a similar idea for hand-object interaction with video generation, but with 3D conditioning.

Robert Scoble • 1 year ago

Wow great work!

Kangfu Mei • 1 year ago

Very nice and creative work!

Daniel Sungho Jung • 1 year ago

Interesting work! Were there any challenges during the research?

Erika S • 1 year ago

InterDyn’s approach to controllable synthesis of interactive dynamics is fascinating. I’m excited to see how it advances intuitive physics with video generative models—truly pushing boundaries in AI and computer vision!
