Today, we publicly released RoboCasa365, a large-scale simulation benchmark for training and systematically evaluating generalist robot models. Built upon our original RoboCasa framework, it offers:
• 2,500 realistic kitchen environments;
• 365 everyday tasks (basic skills + long-horizon mobile manipulation);
• Over 3,200 objects with many articulated fixtures/appliances.
All... are designed for fully controlled, reproducible benchmarking of robotic policies.

Progress in robotic foundation models is real. But it’s still hard to answer basic questions like: How close are we to general-purpose autonomy? What factors drive generalization? What are the model/data scaling curves like? Real-world eval is slow and noisy, and existing sims (like LIBERO, which we built 3 years ago) often lack sufficient task and scene diversity. This benchmark comes with 2,200+ hours of demonstrations and 500K+ trajectories to support studies of multi-task training, pretraining, and continual learning at scale. Check it out at

Yuke Zhu

22,413 subscribers

23,755 views • 4 months ago • via X (Twitter)

Related videos

Excited to announce RoboCasa, a large-scale simulation framework of everyday tasks! We use generative AI tools to create diverse objects, scenes, and tasks. Simulation plays a pivotal role in our Data Pyramid for training generalist robots. Open-source at

Yuke Zhu

141,486 views • 2 years ago

Model shaping is still a craft of a few. That's what AI agents are for: learning it and doing it for everyone else. As part of the FrontierSWE benchmark, we built a 20-hour post-training task on Tinker and found the real bottleneck is research intuition.

Thoughtful

216,670 views • 2 months ago

With the recent progress in large-scale multi-task robot training, how can we advance the real-world deployment of multi-task robot fleets? Introducing Sirius-Fleet✨, a multi-task interactive robot fleet learning framework with 𝗩𝗶𝘀𝘂𝗮𝗹 𝗪𝗼𝗿𝗹𝗱 𝗠𝗼𝗱𝗲𝗹𝘀! 🌍 #CoRL2024

Huihan Liu

28,040 views • 1 year ago

A few weeks ago, we shared our progress on articulated objects and long-horizon tasks. Here are two representative examples:
- We've been steadily expanding our asset library to cover more articulated objects. Articulated objects have always been a challenging asset class to handle in simulation. Interacting with them requires robots to master atomic skills such as pushing, pulling, opening, and closing, and to understand part structure, interaction constraints, and how the object moves.
- Long-horizon tasks can now be generated at scale. Long-horizon tasks are the other hard category: they require chaining multiple sub-goals in sequence. A failure early in the task can cascade and make the rest unrecoverable.
Axis is scaling along three dimensions at once: data volume, data quality, and task difficulty.

Axis Robotics

11,191 views • 3 days ago

[1/16] The real world is large, and we want our AI models to operate on the scale of reality. Over the past two years, my colleagues and I have developed fVDB, a deep learning framework for large-scale, high performance spatial intelligence. We finally announced this work at SIGGRAPH and have released the framework in early access. Here’s a small tour of what fVDB is, and what we’ve used it to do.

Francis Williams

33,898 views • 1 year ago

🧵 Evaluating robot policies in the real world is slow, expensive, and hard to scale. During my internship at SceniX AI this summer, we had many discussions around the two key questions: how accurate must a simulator be for evaluation to be meaningful, and how do we get there? Our new framework, Real2Sim-Eval, takes a step toward that answer. By combining Gaussian Splatting for photorealistic rendering and soft-body digital twins for realistic dynamics, we make simulation predictive of real-world performance. 👉

Kaifeng Zhang

61,000 views • 7 months ago

Today, we're joined by Sergey Levine, associate professor at UC Berkeley EECS and co-founder of Physical Intelligence, to discuss π0 (pi-zero), a general-purpose robotic foundation model. We dig into the model architecture, which pairs a vision language model (VLM) with a diffusion-based action expert, and the model training "recipe," emphasizing the roles of pre-training and post-training with a diverse mixture of real-world data to ensure robust and intelligent robot learning. We review the data collection approach, which uses human operators and teleoperation rigs, the potential of synthetic data and reinforcement learning in enhancing robotic capabilities, and much more. We also introduce the team’s new FAST tokenizer, which opens the door to a fully Transformer-based model and significant improvements in learning and generalization. Finally, we cover the open-sourcing of π0 and future directions for their research. 🎧 / 🎥 Listen or watch the full episode on our page:

📖 CHAPTERS
00:00 - Introduction
2:14 - Physical Intelligence
3:47 - Key challenges in robotic learning
6:13 - Reinforcement learning in π0 and robotic foundation models
8:36 - π0 VLM model architecture
15:33 - π0 model recipe
18:39 - Pre-training dataset
22:47 - Post-training
24:23 - Laundry folding demo
31:32 - Scaling laws on π0 model
34:57 - FAST
40:26 - Open sourcing π0
43:37 - Other robot types
46:27 - Future directions

The TWIML AI Podcast

19,942 views • 1 year ago

Real-world robot data is expensive and slow to collect, creating a major challenge for humanoid development. 🤖 The NVIDIA GR00T N1.6 open vision language action model is pre-trained on a diverse mix of data, including thousands of hours of Stanford Vision and Learning Lab’s BEHAVIOR simulation data, which covers long-horizon everyday manipulation tasks. This diverse training is the key to robust cross-embodiment performance and real-world adaptability. 🌍 Read the blog 🔗

NVIDIA Robotics

13,429 views • 5 months ago

Why do generalist robotic models fail when a cup is moved just two inches to the left? It’s not a lack of motor skill, it’s an alignment problem. Today, we introduce VLS: Vision-Language Steering of Pretrained Robot Policies, a training-free framework that guides robot behavior in real time. Check out the project: 👇🧵 (Watch till the end: VLS runs uncut, steering pretrained policies across long-horizon tasks.)

Jiafei Duan

72,240 views • 4 months ago

📢 Announcing one of the most exciting works from us this year on scalable robot policy evaluation through real-to-sim transfer, moving toward a scalable evaluation engine with structured world models that capture the appearance, geometry, and dynamics of environments involving deformable objects. 🤖 Evaluation remains one of the biggest bottlenecks in building general-purpose robots. Today, robots are still evaluated only in the real world, which is orders of magnitude slower than the development of language agents. We propose a new framework where simulation performance strongly correlates with the real world (r > 0.9), even for deformable objects. The key difference from existing work lies in the correlation between simulation and reality: if a robot model performs better in the digital world, does it also perform better in the real world? This question has long made people hesitant about simulation-based evaluation — especially for deformable objects. We are changing that. Our pipeline achieves effective real-to-sim transfer, establishing state-of-the-art correlation between simulation and reality for deformable object manipulation. It provides a scalable and reproducible evaluation engine for robot learning. 🌐

Yunzhu Li

39,864 views • 7 months ago

Testing robot policies on hardware is slow, expensive and hard to scale. World models offer a promising path to accelerating robot policy development. We're sharing new research from the Runway Robotics team, in which we simulated 8 robot policies inside our General World Model and found 0.95 correlation with real-world results. Those early results point to world model simulation as a practical substitute for hardware evaluation, comparing favorably to existing real-to-sim approaches. Learn more at the link below.

Runway

13,464 views • 4 months ago

Every home is different. That means that to build a useful home robot, we must be able to perform zero-shot generalization on a wide range of tasks. Humanoid company 1X has a solution: world models. 1X Director of Evaluations Daniel Ho joins us on RoboPapers to talk about:
- why world models are the future for scaling robot learning
- how to use world models for robot control
- what world models unlock for evaluating robot model performance
- how we can hill-climb from here to general purpose robots
Watch Episode #61 of RoboPapers, with Michael Cho - Rbt/Acc and Chris Paxton, now!

RoboPapers

27,567 views • 4 months ago

It’s long been a dream of roboticists to be able to teach a robot in simulation so as to skip the long and expensive process of collecting large amounts of real-world training data. However, building simulations for robot tasks is extremely hard. Ideally, we could go from real data to a useful simulation. This is exactly what Guangqi Jiang and his co-authors do. They use 3D Gaussian splatting to reconstruct scenes, which lets them create interactive environments that, when combined with a physics engine, allow for training robot policies that show zero-shot sim-to-real transfer (i.e., using no real-world demonstrations). To learn more, watch Episode 56 of RoboPapers with Michael Cho - Rbt/Acc and Chris Paxton now!

RoboPapers

20,434 views • 6 months ago

State-of-the-art robot policies often need hundreds of hours of data. What if we needed none? Introducing TiPToP: a manipulation system that zero-shots open-world tasks from pixels and language using vision foundation models and GPU-parallelized Task and Motion Planning (TAMP).

Nishanth Kumar

77,488 views • 3 months ago

Over the last few months, we’ve been thinking about how to learn from “off-domain” data - data from non-robot sources like video or simulation. These data sources are not quite good enough to learn policies (even monolithic VLA models) directly, but they still contain lots of information that can be useful for generalizable robot control. How can we develop robot learning models that are able to make use of this type of data for generalizable control? In new work that we call HAMSTER, we show that VLMs can be useful for enabling robotic learning from off-domain data, but specifically when used through hierarchical VLA architectures. We show that this class of models can learn generalizable robot policies for the real world from large-scale, off-domain data. A 🧵 (1/10)

Abhishek Gupta

11,994 views • 1 year ago

A big part of scaling robot learning to solve real-world problems is that we somehow need to get enough diverse, high-quality data to train our robots to perform useful things. GPT and its fellow large language models were bootstrapped and proved out on a massive dataset of real-world language data. Unfortunately, despite our best efforts, similarly massive datasets don’t really exist for robotics, so in our unending pursuit of high-quality, useful data, we turn to simulation. I compared a few recent works on sim-to-real robot manipulation, which discuss how to train perception-driven manipulation policies in simulation in such a way that they’re useful in the real world:
- DextrAH-RGB, from NVIDIA
- Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation, also from NVIDIA (specifically the GEAR lab)
- Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids, another GEAR lab paper
- Local Policies Enable Zero-shot Long-Horizon Manipulation, from CMU
(video from DextrAH-RGB)

Chris Paxton

20,486 views • 1 year ago

Polymath is training world generation models to automate the creation of RL environments. Traditionally, RL environment generation has been bottlenecked by human data. Superintelligence will never be achieved by human data alone. Polymath is building the core technology to enable automated environment generation using far less human effort than traditionally required, and eventually none. This allows for more complex and realistic worlds, and higher quality, scale, and diversity of tasks. This will be essential to unlock RL scaling. The end goal is to create large-scale, long-horizon environments from a text description alone. This will enable the creation of worlds of arbitrary complexity and scale, which is foundational for training & evaluating autonomous, superintelligent AI agents. Congrats on the launch, Dylan Ma and Naren Yenuganti!

Y Combinator

44,192 views • 4 months ago

Unitree founder Wang Xingxing: In robotics, locomotion and basic motion is mostly solved. But grasping and manipulation—anything related to haptics—hasn’t been solved. That’s the key bottleneck preventing them from being deployed at scale in factories and homes. He says that simulation is much faster for training but for manipulation tasks you still need real-world training data—for now.

Kyle Chan

67,474 views • 1 month ago

Everything you love about generative models, now powered by real physics! Announcing the Genesis project, the result of a 24-month large-scale research collaboration involving over 20 research labs: a generative physics engine able to generate 4D dynamical worlds, powered by a physics simulation platform designed for general-purpose robotics and physical AI applications. Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. It delivers a simulation speed ~430,000x faster than real time, and takes only 26 seconds to train a robotic locomotion policy transferable to the real world on a single RTX 4090 (see tutorial). The Genesis physics engine and simulation platform is fully open source. We'll gradually roll out access to our generative framework in the near future. Genesis implements a unified simulation framework from scratch, integrating a wide spectrum of state-of-the-art physics solvers, allowing simulation of the whole physical world in a virtual realm with the highest realism. We aim to build a universal data engine that leverages an upper-level generative framework to autonomously create physical worlds, together with various modes of data, including environments, camera motions, robotic task proposals, reward functions, robot policies, character motions, fully interactive 3D scenes, open-world articulated assets, and more, aiming towards fully automated data generation for robotics, physical AI and other applications. Open Source Code: Project webpage: Documentation: 1/n

Zhou Xian

3,815,891 views • 1 year ago

In my past research experience, finding or developing an appropriate simulation environment, dataset, and benchmark has always been a challenge. Missing features, limited support, or unexpected bugs often occupied my days and nights. Moreover, current simulation platforms are relatively fragmented, making it challenging to replicate the success of the RT-X dataset in unifying community efforts. Introducing RoboVerse: a unified platform, dataset, and benchmark for scalable and generalizable robot learning. We hope to build a shared foundation that combines the community’s efforts. RoboVerse includes:
MetaSim: We carefully designed a configuration system and a universal interface to align current robotic simulators. With MetaSim, you can use any simulator with the same code, bringing together the community’s diverse efforts under one framework!
RoboVerse Dataset and Benchmark: We unify popular simulation environments and benchmarks into a single cohesive system and introduce the RoboVerse dataset, a large-scale, high-quality synthetic dataset. Additionally, we propose a standardized benchmark across both imitation learning and reinforcement learning.
A cool feature enabled by our unified framework: Hybrid Simulation! You can now integrate physics engines and renderers from different simulators, e.g., using MuJoCo’s precise physics with Isaac’s photorealistic rendering. This not only elevates simulation fidelity but also significantly enhances real-world transfer performance across complex robotic applications. We hope our team’s efforts help the robotics community thrive in the years to come. RoboVerse is open-sourced🥳!!! Project Page: Documentation: GitHub Repo: Paper:

Haoran Geng

84,212 views • 1 year ago