Can we learn a 3D world model that predicts object dynamics directly from videos? Introducing Particle-Grid Neural Dynamics: a learning-based simulator for deformable objects that trains from real-world videos. Website: ArXiv: Code: Demo: To appear at #RSS2025

45,946 views • 11 months ago • via X (Twitter)

10 comments

Kaifeng Zhang • 11 months ago

Modeling ropes, cloth, bags, etc. is hard because of their complex physics and partial observability. Classical simulators struggle to construct exact digital twins from real observations. We overcome these challenges by learning neural dynamics directly from videos.

Kaifeng Zhang • 11 months ago

Our particle-based neural dynamics model represents objects as dense 3D particles and predicts their next-step velocities to simulate object dynamics. It features three stages: particle encoding, grid-velocity editing, and grid-to-particle velocity transfer.
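The three stages can be illustrated with a minimal, non-learned sketch. It assumes trilinear interpolation in place of the learned particle encoder, and it assumes that grid-velocity editing means enforcing a simple floor constraint; neither detail is stated in the thread. All names (`CELL`, `particles_to_grid`, `edit_grid`, `grid_to_particles`, `step`) are hypothetical and not taken from the PGND code.

```python
import math
from collections import defaultdict
from itertools import product

CELL = 0.1  # grid spacing in metres (hypothetical value)

def trilinear_weights(p):
    """Return [(node_index, weight), ...] for the 8 grid nodes around point p."""
    base = [math.floor(c / CELL) for c in p]
    frac = [c / CELL - b for c, b in zip(p, base)]
    out = []
    for offs in product((0, 1), repeat=3):
        w = 1.0
        for f, o in zip(frac, offs):
            w *= f if o else 1.0 - f
        out.append((tuple(b + o for b, o in zip(base, offs)), w))
    return out

def particles_to_grid(positions, velocities):
    """Stage 1 stand-in: weighted average of particle velocities at each node.
    (Assumption: the real model uses a learned encoder here.)"""
    num = defaultdict(lambda: [0.0, 0.0, 0.0])
    den = defaultdict(float)
    for p, v in zip(positions, velocities):
        for node, w in trilinear_weights(p):
            den[node] += w
            for k in range(3):
                num[node][k] += w * v[k]
    return {n: [c / den[n] for c in num[n]] for n in num if den[n] > 0.0}

def edit_grid(grid, floor_z=0.0):
    """Stage 2 stand-in: forbid downward motion at nodes on or below the floor."""
    edited = {}
    for node, (vx, vy, vz) in grid.items():
        if node[2] * CELL <= floor_z and vz < 0.0:
            vz = 0.0
        edited[node] = [vx, vy, vz]
    return edited

def grid_to_particles(grid, positions):
    """Stage 3: interpolate grid velocities back to each particle."""
    out = []
    for p in positions:
        v = [0.0, 0.0, 0.0]
        for node, w in trilinear_weights(p):
            gv = grid.get(node, (0.0, 0.0, 0.0))
            for k in range(3):
                v[k] += w * gv[k]
        out.append(v)
    return out

def step(positions, velocities, dt=0.05):
    """One simulation step: particles -> grid -> edited grid -> particles."""
    grid = edit_grid(particles_to_grid(positions, velocities))
    new_v = grid_to_particles(grid, positions)
    new_p = [[c + dt * vc for c, vc in zip(p, v)]
             for p, v in zip(positions, new_v)]
    return new_p, new_v
```

Calling `step` repeatedly rolls the particles forward in time. Routing velocities through a grid means a constraint such as the floor check can be applied once per node rather than once per particle.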

Kaifeng Zhang • 11 months ago

Trained with self-supervision on videos of robot–object interactions, PGND can model diverse deformable objects, including ropes, cloth, stuffed animals, and paper bags, using <20 minutes of data per object.

Kaifeng Zhang • 11 months ago

PGND becomes a 3D action-conditioned video generator when 3D Gaussian Splatting is plugged in. It aligns better with ground truth, producing visually more realistic deformations than the baseline.

Kaifeng Zhang • 11 months ago

PGND can also act as a photorealistic deformable-object simulator with a complete scan of the scene. Given only a static reconstruction, we simulate the segmented object’s motion with a sequence of robot actions (red arrows).

Kaifeng Zhang • 11 months ago

Finally, PGND serves as a 3D world model within Model Predictive Control. It guides dual-arm cloth lifting, rope shaping, box closing, and plush-toy relocation, achieving fast convergence to target configurations.
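To make the planning loop concrete, here is a minimal sketch of Model Predictive Control around a dynamics model. It assumes a random-shooting optimizer, which the thread does not specify, and it uses a toy stand-in (`toy_dynamics`) rather than PGND itself. All function names and parameter values are hypothetical.

```python
import random

def toy_dynamics(state, action):
    """Hypothetical stand-in for a learned world model: state moves by the action."""
    return [s + a for s, a in zip(state, action)]

def cost(state, target):
    """Squared distance between the current state and the target configuration."""
    return sum((s - t) ** 2 for s, t in zip(state, target))

def plan(state, target, dynamics, rng, horizon=5, samples=200, max_step=0.1):
    """Random shooting: sample action sequences, roll each out through the model,
    and return the first action of the sequence with the lowest cumulative cost."""
    best_cost, best_first = float("inf"), None
    for _ in range(samples):
        seq = [[rng.uniform(-max_step, max_step) for _ in state]
               for _ in range(horizon)]
        s, total = state, 0.0
        for a in seq:
            s = dynamics(s, a)
            total += cost(s, target)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first

def run_mpc(start, target, steps=30, seed=0):
    """Replan at every step and execute only the first planned action."""
    rng = random.Random(seed)
    state = list(start)
    for _ in range(steps):
        action = plan(state, target, toy_dynamics, rng)
        state = toy_dynamics(state, action)
    return state
```

Swapping `toy_dynamics` for a learned model with the same `(state, action) -> next_state` signature leaves the loop unchanged, which is what lets a world model serve as a drop-in component of the planner.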

Kaifeng Zhang • 11 months ago

This work is a close collaboration between Columbia University @ColumbiaCompSci and University of Illinois Urbana-Champaign @siebelschool. Huge thanks to my co-authors: @YunzhuLiYZ, Kris Hauser, @BaoyuLi6!

Carlos DP • 11 months ago

I love this, and that you made a hf space demo

Hongyu Li • 11 months ago

This is an exciting work. Congrats!!

Kaifeng Zhang • 11 months ago

Thank you Hongyu!
