Humans grasp objects with a purpose! Web2Grasp enables such functional grasping for dexterous robot hands via hand-object reconstruction from web images - without any robot teleop data collection 1/n

Homanga Bharadhwaj @ CVPR

3,024 subscribers

28,408 views • 1 year ago • via X (Twitter)

Health & Wellness • Science & Technology • Education

4 Comments

Homanga @ CVPR • 1 year ago

To train a functional grasp prediction policy: -> we create a dataset of robot-object grasps by re-targeting human hand-object interactions from web images followed by object mesh refinement with a text-to-3D model -> perform imitation learning on the resulting dataset! 2/n

Homanga @ CVPR • 1 year ago

Web2Grasp was led by amazing @CMU_Robotics collaborators @chen_hongyi_ and @YaoYunchao. Check out Hongyi's thread below and the website for more details. n/n

AirFranz • 1 year ago

Hey Homanga! Great work. Just DM’ed you. Franz

Craig ⚔️ • 1 year ago

Is this 1994 tech?

Related Videos

How to learn dexterous manipulation for any robot hand from a single human demonstration? Check out DexMachina, our new RL algorithm that learns long-horizon, bimanual dexterous policies for a variety of dexterous hands, articulated objects, and complex motions.

Mandi Zhao

120,188 views • 1 year ago

So I heard we need more data for robot learning :) Purely real world teleop is expensive and slow, making large scale data collection challenging. I’ve been excited about getting more data into robot learning, going beyond just real-world teleop data. To this end, we’ve been scaling up data generation with RL in realistic simulations generated on the fly from crowdsourced videos. Enables realistic data collection, much more cheaply than purely real world teleop. Importantly, data collection becomes even *cheaper* with more environments, allowing training with over 100x more data. Transfers to real robots for generalizable manipulation. A 🧵 (1/N)

Abhishek Gupta

13,336 views • 1 year ago

Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI): an open-source $400 system from Stanford University designed to democratize robot data collection. 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)

Cheng Chi

438,741 views • 2 years ago

What representation enables open-world robot manipulation from generated videos? Introducing Dream2Flow, our recent work that bridges video generation and robot control with 3D object flow. Stanford University #ICRA2026 1/N

Wenlong Huang

105,429 views • 2 months ago

In-hand object manipulation is a dexterity litmus test for robot hands. Our new system, published in Science Robotics, can dynamically reorient many different objects in hand in the air. 📚Project website: 🧑‍💻Code: 🧵1/n

Tao Chen

30,666 views • 2 years ago

These days, it feels like a new robotic hand is developed every week. But how can we give them all vision-based general grasping ability? Meet AnyDexGrasp: it enables robust grasping across different robot hands—with just 40 training objects and 4–8 hours of trial and error!

Hao-Shu Fang

25,409 views • 1 year ago

Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands. Everything open-sourced.

Chen Wang

234,658 views • 2 years ago

Low-cost teleop systems have democratized robot data collection, but they lack any force feedback, making it challenging to teleoperate contact-rich tasks. Many robot arms provide force information — a critical yet underutilized modality in robot learning. We introduce: 1. 🦾A low-cost, force-feedback-enabled teleop system. 2. 🥊Force-Attending Curriculum Training (FACTR) uses force to improve generalization in complex, contact-rich tasks. 🧵(1/N)

Jason Liu

151,530 views • 1 year ago

Introducing Omnigrasp: Grasping Diverse Objects with Simulated Humanoids. With Omnigrasp, we show that we can control a humanoid equipped with dexterous hands to grasp diverse objects (>1200) and follow diverse trajectories, with one policy! 🌐: 📜:

Zhengyi “Zen” Luo

92,104 views • 1 year ago

We talked to Ritvik Singh about how you can train sim-to-real dexterous manipulation policies using NVIDIA Isaac. This robot is grasping objects using pure RGB stereo: take in images from a camera pair and predict what to do, all without training in the real world.

Chris Paxton

20,042 views • 9 months ago

Thrilled to share one of my favorite works this year: DexNDM! We bridge the Sim2Real gap for dexterous in-hand rotation, achieving a true "0-to-1" advancement. The key? DexNDM learns from biased, real-world data without needing any successful demonstrations. Now a general-purpose dexterous hand can stably rotate large books, long rods, & complex objects around any axis, from any wrist pose. This powerful primitive enables complex, long-horizon tasks like teleoperated screwing and furniture assembly. 📄 Paper: 🌐 Project Page:

Li Yi

38,466 views • 7 months ago

Vision-language models perform diverse tasks via in-context learning. Time for robots to do the same! Introducing In-Context Robot Transformer (ICRT): a robot policy that learns new tasks by prompting with robot trajectories, without any fine-tuning. [1/N]

Max Fu

40,392 views • 1 year ago

Boston Dynamics collaborated with NVIDIA to demonstrate DextrAH-RGB, a workflow for dexterous grasping from stereo RGB input. The end-to-end policy for Atlas robot, trained entirely in NVIDIA Isaac Lab, transfers zero-shot from simulation to the real robot.

The Humanoid Hub

79,696 views • 1 year ago

Collecting dexterous humanoid robot data is difficult to scale. That's why Mengda Xu and Han Zhang built DexUMI: a tool for demonstrating how to control a dexterous robot hand, which allows you to quickly collect task data. Co-hosted by Michael Cho - Rbt/Acc and Chris Paxton

RoboPapers

26,290 views • 10 months ago

We trained a humanoid with 22-DoF dexterous hands to assemble model cars, operate syringes, sort poker cards, fold/roll shirts, all learned primarily from 20,000+ hours of egocentric human video with no robot in the loop. Humans are the most scalable embodiment on the planet. We discovered a near-perfect log-linear scaling law (R² = 0.998) between human video volume and action prediction loss, and this loss directly predicts real-robot success rate. Humanoid robots will be the end game, because they are the practical form factor with minimal embodiment gap from humans. Call it the Bitter Lesson of robot hardware: the kinematic similarity lets us simply retarget human finger motion onto dexterous robot hand joints. No learned embeddings, no fancy transfer algorithms needed. Relative wrist motion + retargeted 22-DoF finger actions serve as a unified action space that carries through from pre-training to robot execution. Our recipe is called "EgoScale": - Pre-train GR00T N1.5 on 20K hours of human video, mid-train with only 4 hours (!) of robot play data with Sharpa hands. 54% gains over training from scratch across 5 highly dexterous tasks. - Most surprising result: a *single* teleop demo is sufficient to learn a never-before-seen task. Our recipe enables extreme data efficiency. - Although we pre-train in 22-DoF hand joint space, the policy transfers to a Unitree G1 with 7-DoF tri-finger hands. 30%+ gains over training on G1 data alone. The scalable path to robot dexterity was never more robots. It was always us. Deep dives in thread:

Jim Fan

291,303 views • 3 months ago

Open-source dexterous hands with fingertip sensors! 🪬 ORCA Dexterity just released three dexterous hand models. Their mission is to democratize robotic hand dexterity. They're sharing progress on exceptional, low-cost hardware and a software layer from low-level control to robotic hand learning. This is the open-source approach to dexterous manipulation. It has 83 taxels per finger with 0.1 N force detection, which is pretty impressive for an open-source design. Tactile sensing is critical for dexterous manipulation: knowing contact forces enables gentle grasping, slip detection, and force-controlled assembly. Also, the 700 g weight of the lite version makes it practical for mounting on robot arms without exceeding payload limits. Lower weight means faster movements and lower torque requirements. Open hardware accelerates robotics by letting researchers and builders modify designs for their specific needs without starting from scratch!

Lukas Ziegler

38,019 views • 3 months ago

Tired of teleoperation? One human video → 1,000s of robot demos. (📍GitHub ) Scaling Robot Data Without Dynamics Simulation or Robot Hardware Real2Render2Real (R2R2R) is a new way to scale robot data without physics simulation or hardware. You take a phone scan + a single monocular human demo. It tracks the motion, renders photorealistic scenes, and generates diverse, robot-agnostic trajectories ready for training. > No teleop, no sim, no robot, just a phone and a video > Train VLA models and diffusion policies directly on the output > Supports multiple robot embodiments with kinematic consistency > 1000s of demos in 1/27 the time of real-world collection Thank you, Max Fu, for sharing!! Project: Paper: Code coming soon: It shows that with the right pipeline, you can scale robot learning data without touching a robot. One of the most interesting directions in scalable robotics today. —— Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

42,391 views • 4 months ago

Imagine robots learning new skills—without any robot data. Today, we're excited to release EgoZero: our first steps in training robot policies that operate in unseen environments, solely from data collected through humans wearing Aria smart glasses. 🧵👇

Lerrel Pinto

42,538 views • 1 year ago

What if robots could learn real-world tasks from your perspective… without ever touching a robot? This is a system that trains robot policies using nothing but human-first, egocentric video data from smart glasses. No robots, no teleop, no sensors, just humans doing real tasks in the real world. Why it matters ✅ Learns robot policies from 20 minutes of human video; zero robot demos ✅ Generalizes to new objects, views, and even robot morphologies ✅ Uses 3D points for interpretable, spatially grounded learning ✅ Deploys directly to real-world robots with strong zero-shot success Thank you, Vincent Liu, for sharing!!! Learn more here: 🔗 Paper: 🌐 Website: 📍 BOOKMARK FOR LATER

Ilir Aliu - eu/acc

10,509 views • 1 year ago

Sim2Real RL for Vision-Based Dexterous Manipulation on Humanoids. TLDR: we train a humanoid robot with two multifingered hands to perform a range of dexterous manipulation tasks, with robust generalization and high performance, without human demonstration :D

Toru

49,425 views • 1 year ago