Excited to share my final PhD project 😀 We show how simple yet elegant changes enable diffusion transformers to learn SOTA robotic policies on real robots. Our method improves performance by 20% across a wide range of highly dexterous tasks - like cutting sushi! 1/n
20,536 views • 1 year ago • via X (Twitter)
4 Comments

Our method, DiT-Block Policy, works by adding AdaLN layers to the decoder of a standard transformer diffusion policy. This significantly outperformed standard cross-attention blocks, especially when using fewer DDIM iterations during inference. 2/n
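
To make the AdaLN idea concrete, here is a minimal sketch of adaptive layer norm in plain Python. It assumes the common DiT-style formulation, where a conditioning vector (for example, an embedding of observations and the diffusion timestep) is projected to a per-feature scale and shift that modulate the normalized activations. The function names, argument names, and shapes are hypothetical and are not taken from the released code.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(weights, bias, vec):
    """Dense projection: weights is a list of rows, one row per output feature."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def adaln(x, cond, w_scale, b_scale, w_shift, b_shift):
    """Adaptive layer norm (assumed DiT-style form): LN(x) * (1 + scale(cond)) + shift(cond)."""
    scale = linear(w_scale, b_scale, cond)  # per-feature scale predicted from the conditioning
    shift = linear(w_shift, b_shift, cond)  # per-feature shift predicted from the conditioning
    normed = layer_norm(x)
    return [n * (1.0 + s) + t for n, s, t in zip(normed, scale, shift)]

# Toy usage: a 4-dimensional token and a 2-dimensional conditioning vector.
x = [1.0, 2.0, 3.0, 4.0]
cond = [0.5, -0.5]
zero_w = [[0.0, 0.0] for _ in range(4)]
zero_b = [0.0] * 4
out = adaln(x, cond, zero_w, zero_b, zero_w, zero_b)
```

In this sketch, when the projection weights and biases are all zero the layer reduces to a plain layer norm, so the conditioning only changes the activations through the learned scale and shift. A cross-attention block would instead have the tokens attend over the conditioning sequence.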

We release all data and code from our project. This includes BiPlay - a more diverse bi-manual manipulation dataset. Each episode in BiPlay consists of randomized objects, tasks, and settings with accompanying language annotations for scalable learning. 3/n

Finally, I’d like to give a shoutout to my collaborators @oier_mees, Sebastian Zhao, @mohansrirama, and @svlevine who made this project possible! For more information, check out our website: n/n

Superb Work!

