Today at Meta FAIR we’re announcing three new cutting-edge developments in robotics and touch perception — and releasing a collection of artifacts to empower the community to build on this work. Details on all of this new work ➡️ 1️⃣ Meta Sparsh is the first general-purpose encoder for vision-based...

453,035 views • 1 year ago • via X (Twitter)

10 Comments

AI at Meta • 1 year ago

To make these advancements more accessible for different applications, we’re partnering with @GelSight and Wonik Robotics to develop and commercialize these touch-sensing innovations. We’re excited about how this will enable the community to contribute and drive progress in this space.

AI at Meta • 1 year ago

Additionally, looking towards the future, we’re releasing PARTNR: a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. Built on Habitat 3.0, it’s the largest benchmark of its kind to study and evaluate human-robot collaboration in household activities. By providing a standardized benchmark and dataset, we hope to enable new research on robots that can operate not only in isolation but also in collaboration with people. Details and code ➡️

maharshi • 1 year ago

insane, love the name as well: sparsh (in hindi) literally translates to “touch” we need more hindi names :)

XENOWHITE • 1 year ago

Robotics research that is open source too? Holy shit I love you guys

Tony Jose Matos • 1 year ago

::pokes you::

bone • 1 year ago

how long until this

BensenHsu • 1 year ago

Meta Sparsh: The paper introduces a family of general-purpose touch representations called "Sparsh" that are trained using self-supervised learning (SSL) techniques. The authors aim to develop touch representations that can work well across various vision-based tactile sensors and tasks, without the need for extensive labeled data. The authors find that the "Sparsh" representations, especially those trained using DINO and IJEPA, outperform task and sensor-specific end-to-end models by 95.1% on average across the "TacBench" tasks, when using limited labeled data (33-50%). "Sparsh" representations show strong performance in tasks like force estimation, slip detection, pose estimation, and grasp stability, even with as little as 10-33% of the labeled data. full paper:

$Q*🍓on Ethereum • 1 year ago

When are the metabots coming?

Aditya Kumar Saroj • 1 year ago

Is it just me or y'all realize this is some groundbreaking stuff?

AI For Humans Show • 1 year ago

this is so cool -- excited to learn more about it

Related Videos

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from Joelle Pineau. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI).

What we’re releasing:
• Meta Spirit LM: An open source language model for seamless speech and text integration.
• Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2.
• Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.
• SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography.
• Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.
• Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials.
• MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages.
• Self-Taught Evaluator: A new method for generating synthetic preference data to train reward models without relying on human annotations.

Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it. Details and access to everything released by FAIR today ➡️

AI at Meta

150,222 views • 1 year ago

I was really impressed by the UMI gripper (Cheng Chi et al.), but a key limitation is that **force-related data wasn’t captured**: humans feel haptic feedback through the mechanical springs, but the robot couldn’t leverage that info, limiting the data’s value for fine-grained manipulation tasks.

Led by my amazing students Yolanda Zhu and Binghao Huang, we designed a **portable visuo-tactile gripper** by integrating our dense, flexible tactile arrays with the UMI gripper to enable large-scale in-the-wild data collection. 🔗

We demonstrate **cross-modal representation learning** and **downstream policy learning** on tasks requiring in-hand state estimation (e.g., test tube reorientation) and fine-grained force sensing (e.g., pipette fluid transfer).

Key takeaways:
- Our flexible tactile arrays store the rich haptic information humans perceive as dense tactile signals.
- Portability and robustness are key for in-the-wild data collection; our portable gripper is compact, lightweight, and durable.
- Touch provides precise, robust measurements of in-hand object pose, invariant to lighting and viewpoint.
- Cross-modal pretraining on large-scale in-the-wild data significantly improves policy robustness and sample efficiency (as shown many times before, and verified again here!).

Also check out our previous investigations of dense, flexible tactile grids for understanding human-robot-environment interactions:
- Dense tactile glove (Nature ’19):
- 3D-ViTac (CoRL ’24):

Yunzhu Li

13,188 views • 11 months ago