Humans grasp objects with a purpose! Web2Grasp enables such functional grasping for dexterous robot hands via hand-object reconstruction from web images - without any robot teleop data collection 1/n

Homanga Bharadhwaj @ CVPR

3,024 subscribers

28,408 views • 1 year ago • via X (Twitter)

4 Comments

Homanga @ CVPR • 1 year ago

To train a functional grasp prediction policy:
-> we create a dataset of robot-object grasps by re-targeting human hand-object interactions from web images, followed by object mesh refinement with a text-to-3D model
-> perform imitation learning on the resulting dataset! 2/n

Homanga @ CVPR • 1 year ago

Web2Grasp was led by amazing @CMU_Robotics collaborators @chen_hongyi_ and @YaoYunchao. Check out Hongyi's thread below and the website for more details. n/n

AirFranz • 1 year ago

Hey Homanga! Great work. Just DM’ed you. Franz

Craig ⚔️ • 1 year ago

Is this 1994 tech?

Related Videos

How to learn dexterous manipulation for any robot hand from a single human demonstration? Check out DexMachina, our new RL algorithm that learns long-horizon, bimanual dexterous policies for a variety of dexterous hands, articulated objects, and complex motions.

Mandi Zhao

120,954 views • 1 year ago

So I heard we need more data for robot learning :) Purely real world teleop is expensive and slow, making large scale data collection challenging. I’ve been excited about getting more data into robot learning, going beyond just real-world teleop data. To this end, we’ve been scaling up data generation with RL in realistic simulations generated on the fly from crowdsourced videos. Enables realistic data collection, much more cheaply than purely real world teleop. Importantly, data collection becomes even *cheaper* with more environments, allowing training with over 100x more data. Transfers to real robots for generalizable manipulation. A 🧵 (1/N)

Abhishek Gupta

13,350 views • 1 year ago

These days, it feels like a new robotic hand is developed every week. But how can we give them all vision-based general grasping ability? Meet AnyDexGrasp: it enables robust grasping across different robot hands—with just 40 training objects and 4–8 hours of trial and error!

Hao-Shu Fang

25,409 views • 1 year ago

In-hand object manipulation is a dexterity litmus test for robot hands. Our new system in Science Robotics can dynamically reorient many different objects in hand in the air. 📚Project website: 🧑‍💻Code: 🧵1/n

Tao Chen

30,666 views • 2 years ago

Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands. Everything open-sourced.

Chen Wang

234,949 views • 2 years ago

Low-cost teleop systems have democratized robot data collection, but they lack any force feedback, making it challenging to teleoperate contact-rich tasks. Many robot arms provide force information — a critical yet underutilized modality in robot learning. We introduce: 1. 🦾A low-cost, force-feedback-enabled teleop system. 2. 🥊Force-Attending Curriculum Training (FACTR) uses force to improve generalization in complex, contact-rich tasks. 🧵(1/N)

Jason Liu

151,760 views • 1 year ago

We talked to Ritvik Singh about how you can train sim-to-real dexterous manipulation policies using NVIDIA Isaac. This robot is grasping objects using pure RGB stereo: take in images from a camera pair and predict what to do, all without training in the real world.

Chris Paxton

20,067 views • 11 months ago

Thrilled to share one of my favorite works this year: DexNDM! We bridge the Sim2Real gap for dexterous in-hand rotation, achieving a true "0-to-1" advancement. The key? DexNDM learns from biased, real-world data without needing any successful demonstrations. Now a general-purpose dexterous hand can stably rotate large books, long rods, & complex objects around any axis, from any wrist pose. This powerful primitive enables complex, long-horizon tasks like teleoperated screwing and furniture assembly. 📄 Paper: 🌐 Project Page:

Li Yi

38,698 views • 8 months ago

Vision-language models perform diverse tasks via in-context learning. Time for robots to do the same! Introducing In-Context Robot Transformer (ICRT): a robot policy that learns new tasks by prompting with robot trajectories, without any fine-tuning. [1/N]

Max Fu

40,451 views • 1 year ago

We trained a humanoid with 22-DoF dexterous hands to assemble model cars, operate syringes, sort poker cards, fold/roll shirts, all learned primarily from 20,000+ hours of egocentric human video with no robot in the loop.
Humans are the most scalable embodiment on the planet. We discovered a near-perfect log-linear scaling law (R² = 0.998) between human video volume and action prediction loss, and this loss directly predicts real-robot success rate.
Humanoid robots will be the end game, because they are the practical form factor with minimal embodiment gap from humans. Call it the Bitter Lesson of robot hardware: the kinematic similarity lets us simply retarget human finger motion onto dexterous robot hand joints. No learned embeddings, no fancy transfer algorithms needed. Relative wrist motion + retargeted 22-DoF finger actions serve as a unified action space that carries through from pre-training to robot execution.
Our recipe is called "EgoScale":
- Pre-train GR00T N1.5 on 20K hours of human video, mid-train with only 4 hours (!) of robot play data with Sharpa hands. 54% gains over training from scratch across 5 highly dexterous tasks.
- Most surprising result: a *single* teleop demo is sufficient to learn a never-before-seen task. Our recipe enables extreme data efficiency.
- Although we pre-train in 22-DoF hand joint space, the policy transfers to a Unitree G1 with 7-DoF tri-finger hands. 30%+ gains over training on G1 data alone.
The scalable path to robot dexterity was never more robots. It was always us. Deep dives in thread:

Jim Fan

293,383 views • 4 months ago

Open-source dexterous hands with fingertip sensors! 🪬 ORCA Dexterity just released three dexterous hand models. Their mission is to democratize robotic hand dexterity. They're sharing progress on exceptional, low-cost hardware and a software layer from low-level control to robotic hand learning. This is the open-source approach to dexterous manipulation. It has 83 taxels per finger with 0.1 N force detection, which is pretty impressive for an open-source design. Tactile sensing is critical for dexterous manipulation: knowing contact forces enables gentle grasping, slip detection, and force-controlled assembly. Also, the 700 g weight of the lite version makes it practical for mounting on robot arms without exceeding payload limits. Lower weight means faster movements and lower torque requirements. Open hardware accelerates robotics by letting researchers and builders modify designs for their specific needs without starting from scratch! ~~ ♻️ Join the weekly robotics newsletter, and never miss any news →

Lukas Ziegler

38,094 views • 4 months ago

Tired of teleoperation? One human video → 1,000s of robot demos. (📍GitHub)
Scaling Robot Data Without Dynamics Simulation or Robot Hardware
Real2Render2Real (R2R2R) is a new way to scale robot data without physics simulation or hardware. You take a phone scan + a single monocular human demo. It tracks the motion, renders photorealistic scenes, and generates diverse, robot-agnostic trajectories ready for training.
> No teleop, no sim, no robot, just a phone and a video
> Train VLA models and diffusion policies directly on the output
> Supports multiple robot embodiments with kinematic consistency
> 1000s of demos in 1/27 the time of real-world collection
Thank you, Max Fu, for sharing!! Project: Paper: Code coming soon:
It shows that with the right pipeline, you can scale robot learning data without touching a robot. One of the most interesting directions in scalable robotics today.
—— Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

42,804 views • 5 months ago

What if robots could learn real-world tasks from your perspective… without ever touching a robot? This is a system that trains robot policies using nothing but human-first, egocentric video data from smart glasses. No robots, no teleop, no sensors, just humans doing real tasks in the real world.
Why it matters
✅ Learns robot policies from 20 minutes of human video; zero robot demos
✅ Generalizes to new objects, views, and even robot morphologies
✅ Uses 3D points for interpretable, spatially grounded learning
✅ Deploys directly to real-world robots with strong zero-shot success
Thank you, Vincent Liu, for sharing!!! Learn more here: 🔗 Paper: 🌐 Website: 📍 BOOKMARK FOR LATER

Ilir Aliu - eu/acc

10,509 views • 1 year ago

A robot hand that grasps over 500 totally new objects without fail? Zero-shot, single-view & super reliable ⬇️ + Paper
Grasping random objects is hard for robots, especially when shapes, weights, and materials vary. RobustDexGrasp solves this with a smart new way of seeing and controlling the hand, leading to near-perfect grasping, even in noisy or cluttered scenes. Thank you for sharing, Hui Zhang 🙏 Follow him!!
What makes it special
✅ Grabs 500+ unseen objects with 94.6% success using only single-view input
✅ Learns local shapes, not full geometry, for better generalization
✅ Trained with just 35 objects in sim but works in the real world with hundreds more
✅ Adapts to noise, unexpected forces, and even plays chess with VLM planning
It shows that smart sensing and adaptive control can take dexterous grasping to the next level. Project: Paper:

Ilir Aliu

37,980 views • 1 year ago

Open-source robot arm meets hand tracking [📍GitHub below]
It is designed with an industrial mindset but built as a 3D-printed desktop system. PAROL6 paired with a LEAP Motion controller is a nice example of how accessible robot teleoperation has become.
• Hand motion is streamed to the robot at 100 Hz via UDP
• A pneumatic gripper is controlled by simple fist open and close gestures
• The entire robot stack is open source, from mechanics to control software
Combine that with low-latency hand tracking and you get a very practical platform for learning manipulation, teleoperation, and human-robot interfaces. This kind of setup is great for experimentation, teleop, data collection, and teaching robots by demonstration. All without proprietary hardware or locked software. Credit to SourceRobotics 📍Code:

Ilir Aliu

32,025 views • 6 months ago

Marques - You’re vastly underestimating the fixed cost of building any robot. With such a large cost, the robot has to be general purpose. We designed the world for our bodies and hands - the human form is the only universal UI. Marques Brownlee

Brett Adcock

295,166 views • 11 months ago

BCI company BrainCo just released a new dexterous hand: Revo3. But this time, it’s not for humans, it’s built for humanoid robots.
> 21 DOF
> Full-palm + fingertip tactile sensing
> Direct drive + backdrivable design
> 33 grasp types
> 20N fingertip pinch force
> Open-source ecosystem, one-click deployment
It looks bigger and heavier than the previous versions, but it is clearly designed for real robot work: picking up objects, using tools, and performing tasks... btw, what do you think are the real differences between the hands of humanoid robots and the bionic hands used by humans?

CyberRobo

71,783 views • 3 months ago

Feeling what a robot feels is becoming real! 🫰 Fluid Reality just ran its first full end-to-end touch teleoperation system, and the demo shows why high-resolution haptics might be the missing piece in robot control. A 22 mm fingertip display with 32 independently actuated “bubbles” was mapped directly to tactile sensors on a robot hand. 🫧 Three fingers streamed real-time touch data robot → human, giving the operator actual contact feedback instead of waving their hands blind, like most teleop systems today. It’s still early, but this is a real step toward dexterous ops and high-quality data for training Physical AI systems. Companies involved: Samsung Electronics, Alt-Bionics, Inc. (the robot hand), MANUS™ (hand tracking), and Sensible Robotics (fingertip sensors). Touch isn’t a nice-to-have in robotics. It’s the difference between manipulation that works in the real world P.S. Guess how many touch receptors we humans have in our hand! :)

Lukas Ziegler

46,829 views • 7 months ago

A policy that teaches robot hands to touch things the way humans do... not just grab and move, but feel and adjust in real time. Robot manipulation research often stops at picking up objects and placing them. CGP goes further: it handles tasks like opening jars, flipping objects in-hand, wiping dishes, and grasping fragile eggs, the kind of dexterous, contact-rich skills that require constant micro-adjustments based on what the fingers are actually feeling. The robot doesn't just see what it's doing; it predicts what contact should feel like at each step, then checks whether reality matches the prediction. If a finger is slipping, the policy knows before the object drops. Works on real robot hands (both 4-finger and 5-finger designs) with tactile sensors embedded in the fingertips. Robust to visual distractions! The robot keeps flipping a box correctly even when the camera view is disrupted, because it's grounding decisions in touch, not just vision. Baseline policies without contact grounding fail in predictable ways: slipping mid-task, incomplete motions, loss of grasp. CGP avoids these. This is a meaningful step toward robots that can handle the physical world with the kind of reliable, adaptive grip that humans take for granted. Relevant for manufacturing, logistics, assistive robotics, and anywhere fragile or irregular objects need to be handled carefully. Published at RSS 2026, developed with Meta Reality Labs Research. Thanks for sharing, Zhengtong Xu

Ilir Aliu

12,769 views • 1 month ago

🇨🇦 ROBOT HANDS JUST LEVELED UP IN CANADA Sanctuary AI showed off a robotic hand that uses hydraulic power to grip, twist, and move like a real one. The hand can handle delicate objects, like dice, without crushing them, proving robots are getting scary precise. Hydraulic actuation gives the robot smoother, stronger control, making it better for heavy industrial work and tricky small tasks. This could change how robots work in factories, warehouses, or even in jobs too dangerous for humans. The future of robot hands? Less clunky claws, more human-like grip power with machine muscle. Source: Wevolver

Mario Nawfal

50,661 views • 11 months ago