🚀 Introducing InterDyn — our newly accepted CVPR work that explores controllable synthesis of interactive dynamics! Building upon powerful video diffusion models, InterDyn infers future motion and interactions directly from an input image and a dynamic control signal (e.g., a moving hand mask). Check out how we push the...

44,898 views • 1 year ago • via X (Twitter)

10 comments

Haven (Haiwen) Feng • 1 year ago

Dynamic Control & Beyond: Unlike prior methods that rely on explicit simulation or handle only static state transitions, InterDyn builds a dynamic control branch on top of Stable Video Diffusion. We then fine-tune it to generate complex interactions (e.g., hand-object manipulations) and realistic multi-object collisions without heavy simulation computations. #StableDiffusion #StabilityAI 🧵2/6

Haven (Haiwen) Feng • 1 year ago

Intuitive Physics: At its core, InterDyn showcases the diffusion model’s “knowledge” of real-world physics and causal effects. By simply providing a moving object mask, the system implicitly models collisions, force propagation, and object dynamics—no 3D reconstruction or separate physics engine needed. 🧵3/6

Haven (Haiwen) Feng • 1 year ago

Superior Performance: We evaluate InterDyn on both synthetic (CLEVRER) and real-world datasets (Something-Something-v2), achieving up to 37.5% improvement on LPIPS and 77% on FVD over prior work. Whether it’s multi-object collisions or hand-object manipulations, InterDyn produces diverse and physically plausible videos. 🧵4/6

Haven (Haiwen) Feng • 1 year ago

Toward Interactive Video Generation: This new perspective merges intuitive physics with large-scale generative models, opening the door to controllable dynamics synthesis in complex scenes. We believe InterDyn lays the groundwork for future explorations in interactive video generation. Stay tuned for more! 🧵5/6

Haven (Haiwen) Feng • 1 year ago

This work was co-led by me and our talented Master's intern @rick_akker25502 (he's applying for PhD positions now, hire him!), together with our amazing advisors @Michael_J_Black, @dimtzionas, and @vfabrevaya. More details & demos coming soon! See you in Nashville! #CVPR2025 #AI #ResearchPapers 🧵6/6

Adam • 1 year ago

Great work! I had a similar idea for hand-object interaction with video generation, but with 3D conditioning.

Robert Scoble • 1 year ago

Wow great work!

Kangfu Mei • 1 year ago

Very nice and creative work!

Daniel Sungho Jung • 1 year ago

Interesting work! Were there any challenges during the research?

Erika S • 1 year ago

InterDyn’s approach to controllable synthesis of interactive dynamics is fascinating. I’m excited to see how it advances intuitive physics with video generative models—truly pushing boundaries in AI and computer vision!
