
Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI): an open-source $400 system from Stanford University designed to democratize robot data collection. 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual).

438,672 views • 2 years ago • via X (Twitter)

11 Comments

Cheng Chi • 2 years ago

With UMI, you can go to any home, any restaurant and start data collection within 2 minutes. With a diverse in-the-wild cup manipulation dataset, we can train a diffusion policy that generalizes to the top of a water fountain – clearly unseen environments and objects. 2/9

Cheng Chi • 2 years ago

UMI data is robot agnostic. Here we can deploy the same policy on both UR5e and Franka robots. In fact, you can deploy it on any robot with a parallel jaw stroke > 85mm. 3/9

Cheng Chi • 2 years ago

Enabled by our unique wrist-only camera configuration and camera-centric action representation, our robot systems are calibration-free (works even with base movement) and robust against distractors and lighting changes. 4/9
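To illustrate why a camera-centric (relative) action representation removes the need for calibration, here is a minimal sketch. It assumes actions are stored as 4x4 homogeneous poses and re-expressed in the frame of the current gripper/camera pose; the helper names (`make_pose`, `to_relative_actions`, `to_absolute_actions`) are hypothetical and are not taken from the UMI codebase.

```python
import numpy as np


def make_pose(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a yaw angle and a translation (illustrative helper)."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    pose = np.eye(4)
    pose[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    pose[:3, 3] = translation
    return pose


def to_relative_actions(current_pose, target_poses):
    """Express each target pose in the frame of the current gripper/camera pose."""
    current_inv = np.linalg.inv(current_pose)
    return [current_inv @ target for target in target_poses]


def to_absolute_actions(current_pose, relative_actions):
    """Map relative actions back into whatever frame current_pose is expressed in."""
    return [current_pose @ rel for rel in relative_actions]


# Current pose and two future target poses, all in some arbitrary world frame.
current = make_pose(0.3, [0.1, 0.2, 0.3])
targets = [make_pose(0.5, [0.2, 0.2, 0.3]), make_pose(0.7, [0.3, 0.1, 0.4])]
relative = to_relative_actions(current, targets)

# Simulate an unknown change of world frame (e.g. the robot base was moved).
frame_shift = make_pose(1.1, [5.0, -2.0, 0.5])
relative_after_shift = to_relative_actions(
    frame_shift @ current, [frame_shift @ t for t in targets]
)

# The relative actions are identical, so no world-frame calibration is needed.
print(all(np.allclose(a, b) for a, b in zip(relative, relative_after_shift)))
```

Because the world-frame transform cancels out when each target is multiplied by the inverse of the current pose, a policy trained on such relative actions never needs to know where the robot base sits.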

Cheng Chi • 2 years ago

Please check out our website for code, CAD models, tutorials and even more videos! 5/9

Cheng Chi • 2 years ago

Please also check out our epic fails compilation! We achieve a 70-90% success rate on most tasks, which still doesn’t hit the bar for commercial deployment. However, we think getting a larger in-the-wild dataset will get us a lot closer! 6/9

Cheng Chi • 2 years ago

This project would have been impossible without the hard work from co-authors: @Zhenjia_Xu @chuer_pan @eacousineau @Ben_Burchfiel Siyuan Feng @RussTedrake @SongShuran 7/9

Cheng Chi • 2 years ago

It was a blast working with @tonyzzhao and @zipengfu in the Stanford Robotics Center! 8/9

Cheng Chi • 2 years ago

technologies: GPMF, QR control, Voice control, media mod, max lens … have been indispensable for this project. Shout-out to @David_Newman, who personally responded to my questions about timecodes, which are critical for bimanual UMI. 9/9

Advait • 2 years ago

@Stanford really cool! reminds me of this - will have to dive into the paper

Keerthana Gopalakrishnan • 2 years ago

@Stanford I love this, but do you think a wrist-cam-only viewpoint is enough?

Cheng Chi • 2 years ago

@Stanford I think wrist fisheye cams are sufficient for a surprisingly wide range of tasks. I do think there are tasks that could benefit from more views. For those cases, the UMI data pipeline supports an unlimited number of non-gripper GoPros (e.g. head-mounted).
