Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI): an open-source $400 system from Stanford University designed to democratize robot data collection. 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual).

Cheng Chi

8,017 subscribers

438,741 views • 2 years ago • via X (Twitter)

Science, Technology & Education

11 comments

Cheng Chi • 2 years ago

With UMI, you can go to any home, any restaurant and start data collection within 2 minutes. With a diverse in-the-wild cup manipulation dataset, we can train a diffusion policy that generalizes to the top of a water fountain – clearly unseen environments and objects. 2/9

Cheng Chi • 2 years ago

UMI data is robot agnostic. Here we can deploy the same policy on both UR5e and Franka robots. In fact, you can deploy it on any robot with a parallel jaw stroke > 85mm. 3/9

Cheng Chi • 2 years ago

Enabled by our unique wrist-only camera configuration and camera-centric action representation, our robot systems are calibration-free (works even with base movement) and robust against distractors and lighting changes. 4/9

Cheng Chi • 2 years ago

Please check out our website for code, CAD models, tutorials and even more videos! 5/9

Cheng Chi • 2 years ago

Please also check out our epic fails compilation! We achieve a 70-90% success rate on most tasks, which still doesn’t hit the bar for commercial deployment. However, we think getting a larger in-the-wild dataset will get us a lot closer! 6/9

Cheng Chi • 2 years ago

This project would have been impossible without the hard work from co-authors: @Zhenjia_Xu @chuer_pan @eacousineau @Ben_Burchfiel Siyuan Feng @RussTedrake @SongShuran 7/9

Cheng Chi • 2 years ago

It was a blast working with @tonyzzhao and @zipengfu in the Stanford Robotics Center! 8/9

Cheng Chi • 2 years ago

technologies: GPMF, QR control, Voice control, media mod, max lens … Has been indispensable for this project. Shout out to @David_Newman who personally responded to my questions related to timecodes, which is critical for bimanual UMI. 9/9

Advait • 2 years ago

@Stanford really cool! reminds me of this - will have to dive into the paper

Keerthana Gopalakrishnan • 2 years ago

@Stanford I love this but do you think wrist cam only view point is enough?

Cheng Chi • 2 years ago

@Stanford I think wrist fisheye cams are sufficient for a surprisingly wide range of tasks. I do think there are tasks that could benefit from more views. For those cases, the UMI data pipeline supports an unlimited number of non-gripper GoPros (e.g. head-mounted).

Related videos

Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric + UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception

Xiaomeng Xu

75,518 views • 3 months ago

OpenArm is an open-source robot arm designed for human manipulation data collection. What's new in v0.2 (beta): ・Teleop with gravity compensation ・Bilateral force feedback support #OpenSource #Robotics #Humanoids #ArtificialIntelligence

Reazon Human Interaction Lab

43,128 views • 1 year ago

Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands Everything open-sourced

Chen Wang

234,761 views • 2 years ago

So I heard we need more data for robot learning :) Purely real world teleop is expensive and slow, making large scale data collection challenging. I’ve been excited about getting more data into robot learning, going beyond just real-world teleop data. To this end, we’ve been scaling up data generation with RL in realistic simulations generated on the fly from crowdsourced videos. Enables realistic data collection, much more cheaply than purely real world teleop. Importantly, data collection becomes even *cheaper* with more environments, allowing training with over 100x more data. Transfers to real robots for generalizable manipulation. A 🧵 (1/N)

Abhishek Gupta

13,336 views • 1 year ago

Introducing Yell At Your Robot (YAY Robot!) 🗣️- a fun collaboration b/w Stanford University and UC Berkeley 🤖 We enable robots to improve on-the-fly from language corrections: robots rapidly adapt in real-time and continuously improve from human verbal feedback. YAY Robot enables long-horizon, dexterous manipulation tasks like preparing trail-mix, packing a ziploc bag, and cleaning dishes:

Lucy Shi

122,774 views • 2 years ago

We might be solving the wrong problem in robotics. That’s what this makes clear.
UMI → Universal Manipulation Interface
A simple $400 gripper that lets you teach robots by demonstration. You hold it like a tool. Show the task. The robot learns. No teleoperation. No expensive hardware. No robot-specific data. Stanford open-sourced everything → hardware, code, datasets.
What stands out to me is the bottleneck. Not algorithms. Data.
Teleoperation → ~35 demos/hour
UMI → ~111 demos/hour
And the data transfers across robots → UR5, Franka, others.
The design is surprisingly practical:
→ GoPro fisheye lens (155° FOV) + mirrors for depth
→ SLAM + IMU for precise 6DoF tracking
→ latency matching for dynamic tasks
→ diffusion policies for multimodal actions
Then it scales. Cheng Chi takes this further with Sunday Robotics (with Tony Zhao). A $200 glove → deployed in 500+ homes → ~10 million real-world interactions. Not lab data. Real human behavior. Their robot learns dishes, laundry, espresso → with zero robot-specific data.
This is where the shift becomes obvious. From training robots in controlled environments → to learning directly from humans at scale.
So here’s the real question: Will robotics be unlocked by better models… or by unlocking data?
#ArtificialIntelligence #Robotics #AI #Innovation #FutureOfWork

Pascal Bornet

185,867 views • 2 months ago

Humans grasp objects with a purpose! Web2Grasp enables such functional grasping for dexterous robot hands via hand-object reconstruction from web images - without *any* robot teleop data collection 1/n

Homanga Bharadhwaj @ CVPR

28,408 views • 1 year ago

TidyBot++ is an open-source mobile manipulator optimized for household tasks. The robot can be teleoperated using a mobile phone interface, enabling data collection for imitation learning.

The Humanoid Hub

29,188 views • 1 year ago

Introducing Open-TeleVision 🤖: We need an intuitive and remote teleoperation interface to collect more robot data. TeleVision lets you immersively operate a robot even if you are 3000 miles away, like in the movie Avatar. Open-sourced!

Xuxin Cheng

329,389 views • 2 years ago

What representation enables open-world robot manipulation from generated videos? Introducing Dream2Flow, our recent work that bridges video generation and robot control with 3D object flow. Stanford University #ICRA2026 1/N

Wenlong Huang

105,597 views • 3 months ago

Collect robot demos from anywhere through AR! Excited to introduce 🎯DART, a Dexterous AR Teleoperation interface enabling anyone to teleoperate robots in cloud-hosted simulation. With DART, anyone can collect robot demos anywhere, anytime, for multiple robots and tasks in one sitting. All data is automatically logged on our open-sourced cloud database DexHub for public use. 🧵[1/n]

Younghyo Park

27,257 views • 1 year ago

The most frustrating part of imitation learning is collecting huge amounts of teleop data. But why teleop robots when robots can learn by watching us? Introducing Point Policy, a novel framework that enables robots to learn from human videos without any teleop, sim2real, or RL.

Siddhant Haldar

69,056 views • 1 year ago

Can robots self-improve by collecting data autonomously 🤖? Introducing SOAR: a system for large-scale autonomous data collection 🚀 and autonomous improvement 📈 of a multi-task language-conditioned policy in diverse scenes without human interventions.

Paul Zhou

47,667 views • 1 year ago

Open-source robot arm meets hand tracking [📍GitHub below]
It is designed with an industrial mindset but built as a 3D-printed desktop system. PAROL6 paired with a LEAP Motion controller is a nice example of how accessible robot teleoperation has become.
• Hand motion is streamed to the robot at 100 Hz via UDP
• A pneumatic gripper is controlled by simple fist open and close gestures
• The entire robot stack is open source, from mechanics to control software
Combine that with low-latency hand tracking and you get a very practical platform for learning manipulation, teleoperation, and human-robot interfaces. This kind of setup is great for experimentation, teleop, data collection, and teaching robots by demonstration, all without proprietary hardware or locked software.
Credit to SourceRobotics
📍Code:
Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

32,025 views • 5 months ago

Tired of teleoperating your robots? We built a way to scale robot datasets without teleop, dynamic simulation, or even robot hardware. Just one smartphone scan + one human hand demo video → thousands of diverse robot trajectories. Trainable by diffusion policy and VLA models as-is. Introducing: Real2Render2Real 👉

Max Fu

69,224 views • 1 year ago

Collecting data is a PAAAAAAAIN❗ If only there were a solution out there... (Open-Source 😱)
Jannik Grothusen, a researcher and founder in stealth mode 👀 from TU Munich, has created a new tool that helps collect data for robots to learn how to perform tasks. It’s inspired by a device from Stanford, but it’s even better! Here’s why:
✅ It can collect and process data in real-time, meaning no delays.
✅ It’s super easy to control, so collecting data is much faster and simpler.
✅ It can be used with different robot parts because of its smart, modular design.
Best of all, this amazing tool is open-source, so anyone can use it or improve it!
Github:
The system is inspired by Cheng Chi’s Universal Manipulation Interface (UMI). Credit: Jannik Grothusen

Ilir Aliu

47,397 views • 1 year ago

Releasing the Unfolding Robotics blog! Time to unfold robotics: we trained a robot to fold clothes using 8 bimanual setups, 100+ hours of demonstrations, and 5k+ GPU hours. Flashy robot demos are everywhere. But you rarely see the real story: the data, the failures, the engineering. We’re sharing everything: code, data, and details in the blog →

LeRobot

280,565 views • 2 months ago

Imagine robots learning new skills—without any robot data. Today, we're excited to release EgoZero: our first steps in training robot policies that operate in unseen environments, solely from data collected through humans wearing Aria smart glasses. 🧵👇

Lerrel Pinto

42,555 views • 1 year ago

At OpenMind, we’re building an open-source AI operating system that works across any robot. With OpenMind’s Brainpack + @NVIDIA Jetson Thor, robots can run OM1 software and gain full autonomy, even if they weren’t designed for it.

OpenMind

68,720 views • 5 months ago

OpenArm is a fully open-source robotic arm designed for human manipulation data collection - developed by the Reazon Human Interaction Lab in Japan. V0.2 beta features smoother teleop with gravity compensation and bilateral force feedback.

The Humanoid Hub

38,190 views • 1 year ago