
Xiao Ma
@yusufma555 • 1,632 subscribers
Research Scientist @ ByteDance Seed. Prev: @SeaAIL @NUSingapore @sjtu1896. All views are my own.
Videos

I've been working on deformable object manipulation since my PhD. It was a total nightmare years ago, and my PhD advisor told me not to work on it for my own good. Today, at ByteDance Seed, we are dropping GR-RL, a new VLA+RL system that handles long-horizon, precise, dexterous manipulation of deformable objects. This is probably the first real-world RL system to make a robot:
✅ Lace up your shoes end to end
✅ Hit millimeter tolerance repeatedly
✅ Recover from mistakes (see video!)
✅ And complete continuous shoelace threading on a real bimanual platform
📈 Success rate: ↑ from 45.7% to 83.3%
Yes, robots can now actually do this.
Project page: ArXiv:
Xiao Ma • 109,462 views • 6 months ago

Scaling vision-language-action (VLA) models to high-DoF dexterous hands has long been a "holy grail" challenge because of the high-dimensional action space and data scarcity. To wrap up 2025, we are releasing GR-Dexter, a holistic hardware-model-data framework for generalist manipulation on a bimanual dexterous-hand robot. This is the first VLA system to achieve:
✅ High-DoF Control: Managing a 56-DoF bimanual system (21-DoF per hand).
✅ Long-Horizon Tasks with Tool Use: Vacuuming, bread serving with tongs, and table decluttering.
✅ Open-World Generalization: Robust performance with unseen objects and abstract instructions.
Project page: ArXiv:
Xiao Ma • 93,692 views • 5 months ago

🚀🚀🚀 Ever wondered what it takes for robots to handle real-world household tasks? Long-horizon execution, deformable object dexterity, and unseen object generalization: meet GR-3, ByteDance Seed's new Vision-Language-Action (VLA) model! GR-3 is a generalizable VLA model with strong capabilities in complex long-horizon tasks. It understands unseen abstract concepts, manipulates deformable objects robustly, and adapts to novel settings with minimal human data.
✨ Generalization: Generalizes well to unseen objects, environments, and even instructions with abstract concepts.
✨ Long-Horizon Manipulation: Completes long-horizon tasks with strong instruction-following capabilities.
✨ Deformable Object Manipulation: Manipulates deformable objects robustly.
Project page: ArXiv:
#ByteDance #ByteDanceSeed #GR3 #VLA #Robotics #FoundationModels
Xiao Ma • 46,260 views • 11 months ago