
Siyuan Huang
@siyuanhuang95 • 5,004 subscribers
Research Scientist at BIGAI, Director of Center for Embodied AI and Robotics. Ph.D. in Statistics from @UCLA. Former intern at @DeepMind and @MetaAI.

The CEO of Unitree, XingXing Wang, posted a dance video on Rednote to counter claims that the previous dance video was AI- or CG-generated. The dance is performed in front of a mirror and with sound, which shows it is 100% real. Really cool demo! #Unitree #Humanoid #RobotDance
Siyuan Huang • 1,316,929 views • 1 year ago

You might have seen the WuBOT performing at the 2026 Spring Festival Gala; however, most high-dynamic extreme motions you see are executed by overfitted tracking policies. Until now, training a unified policy capable of performing various extreme motions with a high success rate had remained an unsolved challenge. We spent an entire year digging into the barrier between general tracking and extreme physical behaviors. After burning through dozens of G1 robots, we finally identified the bottlenecks in learning and physical executability. With these discoveries, we developed OmniXtreme: the first general policy that can execute diverse extreme motions, including consecutive flips, extreme balancing, and even breakdancing with rapid contact switches! This capability is achieved by pre-training a flow-based generative control policy and then post-training with actuation-aware residual RL for complex physical dynamics, a step we found critical for successful real-world transfer. This work is a collaboration with Unitree. Together, we are pushing the physical limits of humanoid robots. It is incredibly exciting to see a general "robot gymnast" and "robot breakdancer" come to life! It was also our first time publishing a paper with XingXing, which was an enlightening experience. The model checkpoints are now released, and we welcome you to play with them! 📦 📄 Paper: 🌐 Project: 💻 Code:
Siyuan Huang • 106,266 views • 3 months ago

🎉🎉🎉 We won the championship in the solo dance contest at the first World Humanoid Robot Games, in partnership with Unitree! Here is the full video! Training the robot to perform a long dance routine (2:30 min) with stability, smoothness, and agility is much more challenging than we expected. The robot needs to dance to the rhythm, maintain its global position, move dynamically, and never fall. You cannot cherry-pick on the playing field. More technical details will be released in the future.
Siyuan Huang • 115,911 views • 9 months ago

Scaling 3D scene data is a long-standing challenge in scene understanding, spatial reasoning, and robotics. Since scanning, reconstruction, and labeling are so labor-intensive, data scarcity has remained a major bottleneck. 🛑 To solve this, we propose SceneVerse++: Lifting Unlabeled Internet-level Data for 3D Scene Understanding (CVPR 2026). By reconstructing internet videos and annotating 3D scenes automatically, we’ve created a massive real-world dataset for end-to-end understanding. 🌐📐 SceneVerse++ makes it easy to scale "in-the-wild" 3D scenes toward more capable spatial reasoning systems. This significantly advances 3D VQA, visual navigation, and broader tasks in Embodied AI and Robotics. 🤖🦾 Everything is fully open source! Check out the paper, code, and data here: 🌐 Project: 📄 Paper: 📊 Dataset: Code:
Siyuan Huang • 12,433 views • 1 month ago

🤖 Ever dreamed of controlling a humanoid robot to perform complex, long-horizon tasks using just a single Vision Pro? 🎉 Meet CLONE: a holistic, closed-loop, whole-body teleoperation system for long-horizon humanoid control! 🏃‍♂️🧍 CLONE enables rich and coordinated interactive tasks: 🥊 boxing 🏓 table tennis 🤲 object pickup 📦 room arrangement 🤝 handover … and more! 🌀 Our closed-loop error correction powered by LiDAR odometry ensures precision, while motion-captured demonstrations supercharge policy learning, unlocking the full potential of the G1 robot. 🎥 It’s hard to squeeze the magic into 1 minute, so check out the full video demo and project page here: 🔗
Siyuan Huang • 66,556 views • 1 year ago

🥰 Super excited that SceneWeaver won the best paper award at the IROS25 RoboGen workshop. SceneWeaver provides an agentic framework for tool-based 3D scene generation. Given a language description as input, it can generate or edit a corresponding 3D scene with rich detail.
Siyuan Huang • 43,681 views • 7 months ago

Excited to introduce COLA: Learning Human–Humanoid Coordination for Collaborative Object Carrying 🤝🤖 COLA makes humanoids truly helpful in human collaboration — capable of carrying objects, pushing carts, or responding to human push commands. It provides a proprioception-only policy for compliant human–humanoid coordination across diverse movement patterns. The core idea is simple yet effective: 👉 Fine-tune a collaborative policy from a locomotion policy using a residual teacher. 👉 Train in simulation and distill to a real-world student policy for deployment. Paper: Project:
Siyuan Huang • 21,892 views • 7 months ago

🎉🎉🎉 Super excited to announce that we have launched a joint lab for Embodied AI and Humanoid Robots between Unitree and BIGAI! I will direct the lab to foster research on integrating 3D scene understanding capabilities into humanoid robots. Hope to see more generalist humanoid robots in our homes very soon 😁
Siyuan Huang • 42,117 views • 1 year ago

📢📢📢 Excited to release ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning (CVPR25). 🤏🤙✌️ With ManipTrans, we can transfer dexterous manipulation skills to robotic hands in simulation and deploy them on a real robot, using a residual policy learned for dexterous manipulation. 🤖🤖🤖 The video below illustrates how the MoCap data can be transferred to Inspire, Shadow, Xhand, Allegro, and Mano. With ManipTrans, we can greatly scale up dexterous manipulation data with minimal effort. For more details, please check our webpage: code: huggingface:
Siyuan Huang • 20,887 views • 1 year ago

🎤🎤 Excited to introduce COME-robot 🤖🤖, Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V. It is the first closed-loop framework utilizing a vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. COME-robot demonstrates a significant improvement in task success rate (~25%) compared to SOTA methods. Project: Arxiv:
Siyuan Huang • 22,291 views • 2 years ago

Big thanks to AK for highlighting our work! LEO marks our pioneering step towards building an embodied generalist agent that can really comprehend the 3D world! 🚀 Leveraging LLMs, we train LEO with real and synthetic 3D data across a diverse spectrum of tasks. It's thrilling to see LEO surpass current state-of-the-art (SOTA) methods in most benchmarked tasks, all under a single, unified model. 🔥 #Generalist_Agent
Siyuan Huang • 22,710 views • 2 years ago

🤖🤖🤖 Following RoboVerse, we introduce another work focused on robotic tactile simulation: the Taccel simulator. Taccel is a high-performance simulation platform for vision-based tactile sensors and robots. 🚀🚀🚀 Boosted by NVIDIA Warp, Taccel is optimized for highly parallelized simulation and supports 900 fps simulation with 4k+ parallel training environments. 🤝🤝🤝 Taccel is designed with user-friendly APIs and is easy to use. We have open-sourced all the code and documentation. Feel free to try it! Project: Preprint: Code:
Siyuan Huang • 10,650 views • 1 year ago

📢📢📢 Excited to share our new work *Autonomous Character-Scene Interaction Synthesis from Text Instruction* (SIGGRAPH Asia 2024). It presents a unified model for flexible scene-conditioned motion generation given text, scene, and trajectory conditions. The results, with smooth interactions, look very impressive! 📰 Paper: Project: Code and data will be released soon.
Siyuan Huang • 11,340 views • 1 year ago