🚀 Introducing InterDyn — our newly accepted CVPR work that explores controllable synthesis of interactive dynamics! Building upon powerful video diffusion models, InterDyn infers future motion and interactions directly from an input image and a dynamic control signal (e.g., a moving hand mask). Check out how we push the...

44,898 views • 1 year ago • via X (Twitter)

Comments: 10

Haven (Haiwen) Feng • 1 year ago

Dynamic Control & Beyond: Unlike prior methods that rely on explicit simulation or handle only static state transitions, InterDyn adds a dynamic control branch on top of Stable Video Diffusion. We then fine-tune it to generate complex interactions (e.g., hand-object manipulations) and realistic multi-object collisions without heavy simulation computations. #StableDiffusion #StabilityAI 🧵2/6

Haven (Haiwen) Feng • 1 year ago

Intuitive Physics: At its core, InterDyn showcases the diffusion model’s “knowledge” of real-world physics and causal effects. By simply providing a moving object mask, the system implicitly models collisions, force propagation, and object dynamics—no 3D reconstruction or separate physics engine needed. 🧵3/6
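
To make the "moving object mask" idea concrete, here is a minimal sketch of what such a control signal could look like as data: one binary mask per frame, with the masked region translating over time. The function name `moving_box_masks`, the square shape, the constant velocity, and the frame count are all illustrative assumptions; the thread does not describe InterDyn's actual mask format or preprocessing.

```python
import numpy as np

def moving_box_masks(num_frames, height, width, box_size, start_xy, velocity_xy):
    """Build a (num_frames, height, width) stack of binary masks.

    Hypothetical helper: a square region moving at constant velocity
    stands in for the driving object (e.g., a hand).
    """
    masks = np.zeros((num_frames, height, width), dtype=np.uint8)
    x0, y0 = start_xy
    vx, vy = velocity_xy
    for t in range(num_frames):
        # Top-left corner at frame t, clipped so the box stays inside the frame.
        x = int(np.clip(round(x0 + vx * t), 0, width - box_size))
        y = int(np.clip(round(y0 + vy * t), 0, height - box_size))
        masks[t, y:y + box_size, x:x + box_size] = 1
    return masks

# Arbitrary example values: a box sliding right by 3 pixels per frame.
masks = moving_box_masks(
    num_frames=14, height=64, width=64,
    box_size=8, start_xy=(4, 28), velocity_xy=(3, 0),
)
```

A stack like this would be paired with the input image and passed to the model as conditioning. How that conditioning is encoded and injected is specific to the paper and is not shown here.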

Haven (Haiwen) Feng • 1 year ago

Superior Performance: We evaluate InterDyn on both synthetic (CLEVRER) and real-world datasets (Something-Something-v2), achieving up to 37.5% improvement on LPIPS and 77% on FVD over prior work. Whether it’s multi-object collisions or hand-object manipulations, InterDyn produces diverse and physically plausible videos. 🧵4/6
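
LPIPS and FVD are both lower-is-better metrics, so a percentage improvement is commonly read as a relative reduction with respect to a baseline score. The sketch below shows that arithmetic. It assumes this reading (the thread does not say how the percentages were computed), and the scores in the example are made up for illustration, not taken from the paper.

```python
def relative_improvement(baseline, ours):
    """Percent reduction of a lower-is-better metric relative to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - ours) / baseline * 100.0

# Hypothetical scores, not results from the paper.
example = relative_improvement(baseline=200.0, ours=50.0)
print(example)  # 75.0
```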

Haven (Haiwen) Feng • 1 year ago

Toward Interactive Video Generation: This new perspective merges intuitive physics with large-scale generative models, opening the door to controllable dynamics synthesis in complex scenes. We believe InterDyn lays the groundwork for future explorations in interactive video generation. Stay tuned for more! 🧵5/6

Haven (Haiwen) Feng • 1 year ago

This work was co-led by me and our talented Master's intern @rick_akker25502 (he's applying for a PhD now, hire him!), together with our amazing advisors @Michael_J_Black, @dimtzionas, and @vfabrevaya. More details & demos coming soon! See you in Nashville! #CVPR2025 #AI #ResearchPapers 🧵6/6

Adam • 1 year ago

Great work! I had a similar idea for hand-object interaction with video generation, but with 3D conditioning.

Robert Scoble • 1 year ago

Wow great work!

Kangfu Mei • 1 year ago

Very nice and creative work!

Daniel Sungho Jung • 1 year ago

Interesting work! Were there any challenges during the research?

Erika S • 1 year ago

InterDyn’s approach to controllable synthesis of interactive dynamics is fascinating. I’m excited to see how it advances intuitive physics with video generative models—truly pushing boundaries in AI and computer vision!
