Today at Meta FAIR we’re announcing three new cutting-edge developments in robotics and touch perception — and releasing a collection of artifacts to empower the community to build on this work. Details on all of this new work ➡️ 1️⃣ Meta Sparsh is the first general-purpose encoder for vision-based...

453,035 views • 1 year ago • via X (Twitter)

10 comments

AI at Meta • 1 year ago

To make these advancements more accessible for different applications, we’re partnering with @GelSight and Wonik Robotics to develop and commercialize these touch-sensing innovations. We’re excited about how this will enable the community to contribute and drive progress in this space.

AI at Meta • 1 year ago

Additionally, looking towards the future, we’re releasing PARTNR: a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. Built on Habitat 3.0, it’s the largest benchmark of its kind to study and evaluate human-robot collaboration in household activities. By providing a standardized benchmark and dataset, we hope to enable new research on robots that can operate not only in isolation, but also in collaboration with people. Details and code ➡️

maharshi • 1 year ago

insane, love the name as well: sparsh (in hindi) literally translates to “touch” we need more hindi names :)

XENOWHITE • 1 year ago

Robotics research that is open source too? Holy shit I love you guys

Tony Jose Matos • 1 year ago

::pokes you::

bone • 1 year ago

how long until this

BensenHsu • 1 year ago

Meta Sparsh: The paper introduces a family of general-purpose touch representations called "Sparsh" that are trained using self-supervised learning (SSL) techniques. The authors aim to develop touch representations that can work well across various vision-based tactile sensors and tasks, without the need for extensive labeled data. The authors find that the "Sparsh" representations, especially those trained using DINO and IJEPA, outperform task and sensor-specific end-to-end models by 95.1% on average across the "TacBench" tasks, when using limited labeled data (33-50%). "Sparsh" representations show strong performance in tasks like force estimation, slip detection, pose estimation, and grasp stability, even with as little as 10-33% of the labeled data. full paper:

$Q*🍓on Ethereum • 1 year ago

When are the metabots coming?

Aditya Kumar Saroj • 1 year ago

Is it just me or y'all realize this is some groundbreaking stuff?

AI For Humans Show • 1 year ago

this is so cool -- excited to learn more about it

Related videos

Open science is how we continue to push technology forward and today at Meta FAIR we’re sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from Joelle Pineau. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI).

What we’re releasing:
• Meta Spirit LM: An open source language model for seamless speech and text integration.
• Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2.
• Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.
• SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography.
• Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.
• Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials.
• MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages.
• Self-Taught Evaluator: a new method for generating synthetic preference data to train reward models without relying on human annotations.

Access to state-of-the-art AI creates opportunities for everyone. We’re excited to share this work and look forward to seeing the community innovation that results from it. Details and access to everything released by FAIR today ➡️

AI at Meta

150,222 views • 1 year ago

I was really impressed by the UMI gripper (Cheng Chi et al.), but a key limitation is that **force-related data wasn’t captured**: humans feel haptic feedback through the mechanical springs, but the robot couldn’t leverage that info, limiting the data’s value for fine-grained manipulation tasks.

Led by my amazing students Yolanda Zhu and Binghao Huang, we designed a **portable visuo-tactile gripper** by integrating our dense, flexible tactile arrays with the UMI gripper to enable large-scale in-the-wild data collection. 🔗

We demonstrate **cross-modal representation learning** and **downstream policy learning** on tasks requiring in-hand state estimation (e.g., test tube reorientation) and fine-grained force sensing (e.g., pipette fluid transfer).

Key takeaways:
- Our flexible tactile arrays store the rich haptic information humans perceive as dense tactile signals.
- Portability and robustness are key for in-the-wild data collection; our portable gripper is compact, lightweight, and durable.
- Touch provides precise, robust measurements of in-hand object pose, invariant to lighting and viewpoint.
- Cross-modal pretraining on large-scale in-the-wild data significantly improves policy robustness and sample efficiency (as shown many times before, and verified again here!).

Also check out our previous investigations of dense, flexible tactile grids for understanding human-robot-environment interactions:
- Dense tactile glove (Nature ’19):
- 3D-ViTac (CoRL ’24):

Yunzhu Li

13,188 views • 11 months ago