Can we learn a 3D world model that predicts object dynamics directly from videos? Introducing Particle-Grid Neural Dynamics (PGND): a learning-based simulator for deformable objects that trains from real-world videos. To appear at #RSS2025.

45,971 views • 1 year ago • via X (Twitter)

10 Comments

Kaifeng Zhang • 1 year ago

Modeling ropes, cloth, bags, etc. is hard because of their complex physics and partial observability. Classical simulators struggle to construct exact digital twins from real observations. We overcome these challenges by learning neural dynamics directly from videos.

Kaifeng Zhang • 1 year ago

Our particle-based neural dynamics model represents objects as dense 3D particles and predicts their next-step velocities to simulate object dynamics. It features three stages: particle encoding, grid-velocity editing, and grid-to-particle velocity transfer.
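As a rough illustration of the last stage (grid-to-particle velocity transfer), here is a minimal sketch in plain Python. It assumes a regular grid and trilinear interpolation, neither of which is stated in the thread, and the names `grid_to_particle` and `step_particles` are hypothetical. In PGND the grid velocities would come from the learned model; here they are simply given as input.

```python
import math

def grid_to_particle(grid_vel, particle, cell_size=1.0):
    """Trilinearly interpolate a grid velocity at one particle position.

    Assumption: grid_vel[i][j][k] is a 3-tuple velocity stored at the grid
    node located at (i, j, k) * cell_size, with at least 2 nodes per axis.
    """
    dims = (len(grid_vel), len(grid_vel[0]), len(grid_vel[0][0]))
    base, frac = [], []
    for coord, n in zip(particle, dims):
        g = coord / cell_size                          # continuous grid coordinate
        i0 = min(max(int(math.floor(g)), 0), n - 2)    # lower node, kept in bounds
        base.append(i0)
        frac.append(min(max(g - i0, 0.0), 1.0))        # position inside the cell
    vel = [0.0, 0.0, 0.0]
    # Blend the 8 surrounding nodes with trilinear weights.
    for di in (0, 1):
        wx = frac[0] if di else 1.0 - frac[0]
        for dj in (0, 1):
            wy = frac[1] if dj else 1.0 - frac[1]
            for dk in (0, 1):
                wz = frac[2] if dk else 1.0 - frac[2]
                node = grid_vel[base[0] + di][base[1] + dj][base[2] + dk]
                for axis in range(3):
                    vel[axis] += wx * wy * wz * node[axis]
    return tuple(vel)

def step_particles(particles, grid_vel, dt, cell_size=1.0):
    """Advance every particle one explicit Euler step with its interpolated velocity."""
    moved = []
    for p in particles:
        v = grid_to_particle(grid_vel, p, cell_size)
        moved.append(tuple(p[axis] + dt * v[axis] for axis in range(3)))
    return moved
```

Because every particle reads its velocity from the same grid, nearby particles receive similar velocities. That smoothing is one plausible reason to route predictions through a grid, though the thread does not give the authors' motivation.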

Kaifeng Zhang • 1 year ago

Trained with self-supervision on videos of robot–object interactions, PGND can model diverse deformable objects, including ropes, cloth, stuffed animals, and paper bags, using under 20 minutes of data per object.

Kaifeng Zhang • 1 year ago

PGND becomes a 3D action-conditioned video generator when 3D Gaussian Splatting is plugged in. It aligns better with ground truth, producing visually more realistic deformations than the baseline.

Kaifeng Zhang • 1 year ago

PGND can also act as a photorealistic deformable-object simulator with a complete scan of the scene. Given only a static reconstruction, we simulate the segmented object’s motion with a sequence of robot actions (red arrows).

Kaifeng Zhang • 1 year ago

Finally, PGND serves as a 3D world model within Model Predictive Control. It guides dual-arm cloth lifting, rope shaping, box closing, and plush-toy relocation, achieving fast convergence to target configurations.
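To make the world-model-in-MPC idea concrete, here is a minimal random-shooting sketch in plain Python. It is not the planner used in the work: the thread does not say which optimizer PGND is paired with, so random shooting is an assumption, and `toy_dynamics` is a hypothetical stand-in for the learned dynamics model. All function names and parameters are illustrative.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rollout(state, actions, dynamics):
    """Apply a sequence of actions through the dynamics model."""
    for action in actions:
        state = dynamics(state, action)
    return state

def plan_action(state, target, dynamics, horizon=5, num_samples=64,
                max_step=0.1, seed=0):
    """Random-shooting MPC: sample action sequences, keep the best one.

    Returns the first action of the best sequence and that sequence's
    predicted final cost (squared distance to the target).
    """
    rng = random.Random(seed)
    dim = len(state)
    # Include a do-nothing sequence so the best plan is never predicted
    # to end farther from the target than staying put (a design choice
    # of this sketch).
    candidates = [[tuple(0.0 for _ in range(dim))] * horizon]
    for _ in range(num_samples):
        candidates.append([
            tuple(rng.uniform(-max_step, max_step) for _ in range(dim))
            for _ in range(horizon)
        ])
    best = min(candidates,
               key=lambda acts: sq_dist(rollout(state, acts, dynamics), target))
    return best[0], sq_dist(rollout(state, best, dynamics), target)

def toy_dynamics(state, action):
    """Hypothetical placeholder for a learned model: state moves by the action."""
    return tuple(s + a for s, a in zip(state, action))
```

In a closed loop, one would execute the returned action, observe the new state, and call `plan_action` again. Swapping `toy_dynamics` for a learned predictor is what turns the dynamics model into a planner's world model.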

Kaifeng Zhang • 1 year ago

This work is a close collaboration between Columbia University @ColumbiaCompSci and University of Illinois Urbana-Champaign @siebelschool. Huge thanks to my co-authors: @YunzhuLiYZ, Kris Hauser, @BaoyuLi6!

Carlos DP • 1 year ago

I love this, and that you made a hf space demo

Hongyu Li • 1 year ago

This is an exciting work. Congrats!!

Kaifeng Zhang • 1 year ago

Thank you Hongyu!
