🎉 Diffusion-style annealing + sampling-based MPC can surpass RL, and seamlessly adapt to task parameters, all 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴-𝗳𝗿𝗲𝗲! We open-sourced DIAL-MPC, the first training-free method for whole-body torque control using full-order dynamics 🧵
172,397 views • 1 year ago • via X (Twitter)
Comments: 11

1/9 Surprisingly, we found that the formulation of sampling-based MPC and a single diffusion step are 𝗲𝗾𝘂𝗶𝘃𝗮𝗹𝗲𝗻𝘁. This motivates us to do multi-step diffusion in MPC (theoretical proofs in the paper). 𝗥𝗲𝗮𝗹-𝘁𝗶𝗺𝗲 sampling is possible! We did it thanks to recent advances in massively parallel simulation.
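
For readers new to sampling-based MPC, here is a minimal sketch of one update in the style of MPPI: sample perturbed control sequences around the current plan, weight them by exp(-cost / temperature), and take the weighted average. It is a plain-Python illustration on a toy cost, not the DIAL-MPC implementation; `mppi_update`, `toy_cost`, and every parameter value are assumptions made up for this example.

```python
import math
import random


def mppi_update(mean, cost_fn, sigma=0.5, temperature=0.1, num_samples=256, seed=0):
    """One MPPI-style update of a control sequence `mean` (a list of floats)."""
    rng = random.Random(seed)
    # Sample perturbed control sequences around the current plan.
    samples = [[u + rng.gauss(0.0, sigma) for u in mean] for _ in range(num_samples)]
    costs = [cost_fn(s) for s in samples]
    # Softmax-style weights; subtracting the best cost keeps exp() well-behaved.
    best = min(costs)
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    # The new plan is the weighted average of the samples.
    return [
        sum(w * s[t] for w, s in zip(weights, samples)) / total
        for t in range(len(mean))
    ]


def toy_cost(controls):
    """Hypothetical cost: every control should be close to 1.0."""
    return sum((u - 1.0) ** 2 for u in controls)


plan = [0.0, 0.0, 0.0]
new_plan = mppi_update(plan, toy_cost)
print(toy_cost(plan), toy_cost(new_plan))
```

A single call moves the plan toward low-cost samples; repeating the call with a changing noise level is what turns this one step into a multi-step procedure.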

2/9 There are two diffusion-style annealings in DIAL-MPC: Trajectory-level: within a single timestep, we iteratively re-sample with an adaptive distribution, which is equivalent to denoising in diffusion.
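
The trajectory-level idea can be sketched as a loop that repeats a weighted-sample update while shrinking the sampling noise, so early iterations search coarsely and later ones refine. This is only an illustration: the geometric decay, the function names (`weighted_update`, `annealed_plan`, `rough_cost`), and all constants are assumptions, not the schedule used in DIAL-MPC.

```python
import math
import random


def weighted_update(mean, cost_fn, sigma, temperature, num_samples, rng):
    """Sample around `mean`, then average the samples with exp(-cost) weights."""
    samples = [[u + rng.gauss(0.0, sigma) for u in mean] for _ in range(num_samples)]
    costs = [cost_fn(s) for s in samples]
    best = min(costs)
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    return [
        sum(w * s[t] for w, s in zip(weights, samples)) / total
        for t in range(len(mean))
    ]


def annealed_plan(mean, cost_fn, sigma_start=1.0, decay=0.5, num_iters=5,
                  temperature=0.1, num_samples=256, seed=0):
    """Repeat the update within one timestep while shrinking the noise."""
    rng = random.Random(seed)
    sigma = sigma_start
    for _ in range(num_iters):
        mean = weighted_update(mean, cost_fn, sigma, temperature, num_samples, rng)
        sigma *= decay  # assumed geometric schedule: coarse first, fine later
    return mean


def rough_cost(controls):
    """Hypothetical rough landscape: a bowl centred at 2.0 plus cosine ripples."""
    return sum(
        (u - 2.0) ** 2 + 0.3 * (1.0 - math.cos(6.0 * (u - 2.0)))
        for u in controls
    )


start = [0.0, 0.0]
result = annealed_plan(start, rough_cost)
print(rough_cost(start), rough_cost(result))
```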

3/9 And action-level: we leverage the receding-horizon nature of MPC to re-use partially diffused actions in future steps.
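
One way to picture the action-level part: after executing the first action, the rest of the plan is shifted forward and re-used, and actions further out in the horizon (which have been refined fewer times) get more sampling noise. The sketch below assumes a geometric noise schedule and a zero-filled final action; `shift_plan`, `action_sigmas`, and the values are hypothetical and not taken from the DIAL-MPC code.

```python
def shift_plan(plan, fill=0.0):
    """Drop the executed first action and append a fresh, unrefined last one."""
    return plan[1:] + [fill]


def action_sigmas(horizon, sigma_near=0.1, sigma_far=1.0):
    """Per-action noise scale, growing geometrically along the horizon."""
    if horizon == 1:
        return [sigma_near]
    ratio = (sigma_far / sigma_near) ** (1.0 / (horizon - 1))
    return [sigma_near * ratio ** i for i in range(horizon)]


plan = [0.4, 0.3, 0.2, 0.1]
executed = plan[0]            # action sent to the robot at this timestep
next_plan = shift_plan(plan)  # [0.3, 0.2, 0.1, 0.0], warm start for the next step
sigmas = action_sigmas(len(next_plan))
print(executed, next_plan, sigmas)
```

Near-term actions in `next_plan` were already refined in earlier timesteps, so they only need small perturbations, while the newly appended action starts from scratch with the largest noise.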

4/9 Some DIAL-MPC tasks can be deployed directly on real hardware. We achieve 𝗿𝗲𝗮𝗹-𝘁𝗶𝗺𝗲 𝘁𝗼𝗿𝗾𝘂𝗲 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 on a quadruped doing versatile tasks. We can even give it a very heavy payload (6 kg 👇), and it easily adapts once we modify the mass in the model.

5/9 Why is DIAL-MPC better than conventional sampling-based MPC? In our paper and on our website, we include a toy experiment with a very rough cost function landscape to explain the theory.

6/9 Although not an 🍎 to 🍎 comparison, why can DIAL-MPC be better than an RL policy? 1. It is training-free: all reward designs can be instantly verified. 2. It is explicitly adaptive: facing OOD tasks like📷

7/9 We emphasize that DIAL-MPC is not meant to compete with RL - its future directions align very well with RL: 1. Adding nominal value and policy functions can shorten the horizon DIAL-MPC needs. 2. It can be an "RL reward visualizer" to accelerate RL reward engineering.

8/9 DIAL-MPC runs parallel to popular works that use diffusion-style annealing for robot learning. A significant distinction is that DIAL-MPC is model-based, whereas these are model-free: - Diffusion Policy Policy Optimization - Streaming Diffusion Policy

9/9 Thanks for reading through! Check out our project website! We also open-source the code on GitHub to run all DIAL-MPC demos, including the rendering pipeline. Follow the authors @HaoruXue @ChaoyiPan @zejiyi @guannanqu @GuanyaShi for more updates!

Looks amazing! Congrats! And kudos on the visuals too!

thank you! we also open-sourced our blender pipeline in the codebase, courtesy of the UMI on Legs authors


