Model progress is no longer constrained by architecture, but by access to high-quality, human-generated data. Scraped internet data is finite, low-signal, and increasingly synthetic. Kled AI pays real people to upload authentic, real-world content, task by task, in its mobile app. We are super excited to back Avi...

36,782 views • 3 months ago • via X (Twitter)

Related videos

Synthetic data will provide the next trillion tokens to fuel our hungry models. I'm excited to announce MimicGen: massively scaling up the data pipeline for robot learning! We multiply high-quality human data in simulation with digital twins. Using 50,000 training episodes across 18 tasks, multiple simulators, and even in the real world!

The idea is simple:

1. Humans tele-operate the robot to complete a task. The resulting data is extremely high-quality, but collecting it is very slow and expensive.
2. We create a digital twin of the robot and the scene in high-fidelity, GPU-accelerated simulation.
3. We can now move objects around, replace them with new assets, and even change the robot hand: basically, we augment the training data with procedural generation.
4. Export the successful episodes, and feed them to a neural network!

You now have a near-infinite stream of data.

One of the key reasons that robotics lags far behind other AI fields is the lack of data: you cannot scrape control signals from the internet. They simply don't exist in the wild. MimicGen shows the power of synthetic data and simulation to keep our scaling laws alive.

I believe this principle applies beyond robotics. We are quickly exhausting the high-quality, real tokens from the web. Artificial intelligence from artificial data will be the way forward.

We are big fans of the OSS community. As usual, we open-source everything, including the generated dataset!

- Website:
- Paper:
- Dataset is hosted on HuggingFace (thanks AK!!):
- Code:

MimicGen is led by Ajay Mandlekar, deep dive in the thread:

Jim Fan

332,164 views • 2 years ago
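
The four-step loop described in the MimicGen post (collect a human demonstration, randomize the scene, replay, keep only the successes) can be illustrated with a toy sketch. This is not MimicGen's code or API: the 2D "scene", the `Episode` type, the noise level, and every function name below are hypothetical stand-ins chosen only to show the shape of the generate-and-filter idea.

```python
from __future__ import annotations

import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """Hypothetical toy episode: one object and a 2D end-effector path."""
    object_xy: tuple[float, float]
    waypoints: tuple[tuple[float, float], ...]


def human_demo() -> Episode:
    # Step 1 (stand-in): a single "tele-operated" demo that reaches the object.
    obj = (0.5, 0.5)
    return Episode(obj, ((0.0, 0.0), (0.25, 0.25), obj))


def randomize(demo: Episode, rng: random.Random) -> Episode:
    # Steps 2-3 (stand-in): move the object, shift the path with it,
    # and add execution noise so that some replays miss the object.
    dx, dy = rng.uniform(-0.3, 0.3), rng.uniform(-0.3, 0.3)
    new_obj = (demo.object_xy[0] + dx, demo.object_xy[1] + dy)
    moved = tuple(
        (x + dx + rng.gauss(0.0, 0.02), y + dy + rng.gauss(0.0, 0.02))
        for x, y in demo.waypoints[1:]
    )
    return Episode(new_obj, (demo.waypoints[0],) + moved)


def is_success(episode: Episode, tol: float = 0.03) -> bool:
    # Assumed success criterion: the final waypoint ends within `tol` of the object.
    return math.dist(episode.waypoints[-1], episode.object_xy) < tol


def generate_dataset(demos: list[Episode], n_trials: int, seed: int = 0) -> list[Episode]:
    # Step 4: keep only the successful generated episodes for training.
    rng = random.Random(seed)
    candidates = (randomize(rng.choice(demos), rng) for _ in range(n_trials))
    return [ep for ep in candidates if is_success(ep)]


if __name__ == "__main__":
    dataset = generate_dataset([human_demo()], n_trials=200)
    print(f"kept {len(dataset)} of 200 generated episodes")
```

The filtering step is what keeps data quality from degrading as volume grows: failed replays are discarded rather than fed to the network, so one demonstration can seed many usable variations.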