Open-source robotics datasets have exploded this year, turning the field into a more scalable and collaborative ecosystem. We can expect major breakthroughs in the very near future!

14,321 views • 5 months ago • via X (Twitter)

Related videos

A Letter to Our Community: The Road Ahead for Robotics

To our Community and Partners,

As we step into 2026, our mission at Axis is clearer than ever: Constructing the definitive End-to-End Scaling Layer for Robotics. Our goal is to accelerate the transfer of diverse human intelligence into Robotics General Intelligence (RGI). By owning the critical path of intelligence creation, we are turning the physical limitations of robotics into a scalable, software-driven future. Here is our strategic outlook and roadmap for the year ahead.

The Core Thesis: Simulation is the Only Way Out

The path to RGI is currently blocked by Data Scarcity, Generalization Fragility, and Hardware Fragmentation. At Axis, we believe Simulation is the only way out. Our Simulation Data Platform and Data Augmentation Engine transform raw data into "Synthetic Gold". Backed by academic milestones like Roboverse, Skill Blending, and GraspVLA, we have proven that pure simulation can achieve the generalization required for the real world. We don’t just collect data; we architect it.

The Engine: Why Crypto?

We believe RGI should come from all, not a few. Crypto is not just a feature; it is the primitive that powers our entire ecosystem flywheel:

- Incentive Mechanism: Democratizing contribution and rewarding the trainers and developers.
- Assetization: Turning proprietary data and refined models into liquid, ownable assets.
- Verifiable Workflow: We are opening the "Black Box" of AI. By bringing total transparency to the Task Generation → Data Collection → Model Training pipeline, we ensure every byte of intelligence is verifiable, traceable, and secure.

2026 Strategic Deliverables

This year, we are committed to delivering three foundational pillars:

- The World's Largest Training Dataset for Robots: A robot training set of diverse, high-quality interaction data at an unprecedented scale.
- A Robotics Foundation Model: A universal robotic brain trained on our pure simulation and synthetic data, capable of robust cross-embodiment transfer and open-world adaptability.
- Evolvable Robot Hardware: Robots deployed with Axis models that autonomously evolve through continuous interaction, turning every deployment into a self-improving node within our RGI network.

The Ultimate Vision

We are building more than models; we are architecting the Distributed Machine Economy. A future where every dataset, model, and robotic embodiment is a verifiable asset in a global, autonomous network.

Thank you for building the future of intelligence with us✌️📷

Axis Robotics

27,858 views • 5 months ago

🚀 Introducing EgoExo Forge, built on top of Rerun, Gradio, and the Hugging Face Hub

(I’ll be in San Francisco July 21–29. If you’re into robotics, egocentric AI, large-scale data collection, or just want to chat, DM me!)

In my opinion, large-scale, diverse, and high-quality data is still the largest bottleneck for generalized robotics deployment. I believe that some version of imitation learning from human examples will be the most scalable and cleanest way to train humanoid robots 🤖 (similar to what Tesla did for Full Self Driving). Teleop is too expensive for collecting a large enough dataset in a reasonable manner, so passive collection via egocentric (and in certain cases, exocentric) views feels like the right bet.

Over the past few months, I've been building out the scaffolding for this, using Rerun as my underlying infrastructure. The data being collected is time series and needs to be easily inspectable, and Rerun provides the right tooling for this. My goal is to first build a representative ground-truth dataset from existing open-source data, generate some reasonable baselines, and then go out and collect my own data that adheres to the defined schema.

🔍 Starting with open-source datasets

1. EgoDex from Apple
2. HOCap from Nvidia and the University of Texas at Dallas
3. Assembly101 from Meta

These datasets all have different sensor configurations and annotations, so my goal with egoexo-forge is to have one consistent labeling scheme and data layout. I built a data pipeline that aligns the different datasets to one general schema, assuming the COCO133 keypoint layout, that allows for exo+ego, ego only, or exo only.

Since the scaffolding is already there, it becomes MUCH easier to add other datasets. The next ones I'll be including are the HD-EPIC kitchens dataset, HOT3D, and finally my own personal iPhone + Insta360 GO collection method.

Once I have a diverse variety of datasets, I'll double down on what I believe to be the key algorithms required to make useful data for imitation learning 📊

1. Camera pose estimation via SLAM/SfM for the ego perspective (and automatic calibration for exo)
2. Human pose estimation for both egocentric and exocentric views
3. Metric 3D reconstruction + object tracking

I'll be setting up reasonable open-source baselines for each of these to validate that these datasets work, and then finally try to use the generated datasets for some imitation learning via the pi0-lerobot repo I've been working on.

I plan on making a blog post and providing more info on all of this in the near future, so stay tuned.
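
To make the "one general schema" idea concrete, here is a minimal Python sketch of how sequences from different datasets might be normalized to a shared 133-keypoint layout with ego and exo views. It is not taken from egoexo-forge: the class names, field names, `to_coco133` helper, and the index map in the usage note are all hypothetical, and missing joints are simply marked as NaN.

```python
from dataclasses import dataclass, field

import numpy as np

NUM_KEYPOINTS = 133  # COCO133 whole-body layout: body, feet, face, and hands


def to_coco133(src_kpts: np.ndarray, index_map: dict[int, int]) -> np.ndarray:
    """Scatter source keypoints of shape (frames, K, 2) into a (frames, 133, 2) array.

    `index_map` maps a source joint index to a target index in the shared layout.
    Joints the source dataset does not annotate stay NaN.
    """
    frames = src_kpts.shape[0]
    out = np.full((frames, NUM_KEYPOINTS, 2), np.nan, dtype=float)
    for src_idx, dst_idx in index_map.items():
        out[:, dst_idx] = src_kpts[:, src_idx]
    return out


@dataclass
class CameraView:
    """One camera stream in the shared schema (hypothetical field names)."""
    name: str
    kind: str  # "ego" or "exo"
    keypoints_2d: np.ndarray  # shape (frames, 133, 2)

    def __post_init__(self) -> None:
        if self.kind not in ("ego", "exo"):
            raise ValueError(f"unknown view kind: {self.kind!r}")
        if self.keypoints_2d.shape[1:] != (NUM_KEYPOINTS, 2):
            raise ValueError("keypoints must have shape (frames, 133, 2)")


@dataclass
class Sequence:
    """A recording from any source dataset, expressed in the shared schema."""
    dataset: str
    views: list[CameraView] = field(default_factory=list)

    def mode(self) -> str:
        """Report which of the three supported configurations this sequence is."""
        kinds = {view.kind for view in self.views}
        if kinds == {"ego", "exo"}:
            return "exo+ego"
        if kinds == {"ego"}:
            return "ego only"
        if kinds == {"exo"}:
            return "exo only"
        raise ValueError("sequence has no camera views")
```

As a usage illustration, a dataset that only annotates a few joints could be converted with something like `to_coco133(raw, {0: 91, 2: 112})` (made-up indices) and wrapped in a `CameraView`; downstream baselines can then treat every dataset the same way and skip NaN joints.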

Pablo Vela

32,085 views • 11 months ago