Three months ago we released the Open X-Embodiment dataset. Today we’re taking the next step: introducing Octo 🐙, a generalist robot policy trained on 800k robot trajectories. It is stronger than RT-1X, supports flexible observation + action spaces, and is fully open source! 💻: /🧵

126,658 views • 2 years ago • via X (Twitter)

13 comments

Karl Pertsch, 2 years ago

Out of the box, Octo can control multiple robots and use 3rd person + wrist cameras, language instructions & goal images. Key feature: Octo can be quickly finetuned to use new observation & action spaces, in <5 hours on a GPU with 24 GB VRAM! 2/

Karl Pertsch, 2 years ago

If we want to build truly “foundational” models for robotics we need to support the diversity of real robot setups! Despite the added flexibility, we find Octo's performance to be strong compared to RT-1X and even RT-2X + great during finetuning! 3/

Karl Pertsch, 2 years ago

Octo is built to scale: it’s a big transformer with small encoders at the input and a small action head at the output. We use diffusion action decoding for max expressiveness 4/
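
To make "diffusion action decoding" concrete, here is a minimal pure-Python sketch of generic DDPM-style reverse sampling over an action vector. It is not Octo's implementation: the linear noise schedule, the step count, the 7-dimensional `TARGET` action, and the `oracle_noise` predictor (a hypothetical stand-in for a learned action head conditioned on the transformer's output) are all assumptions chosen so the example runs on its own.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and the cumulative products of alpha = 1 - beta."""
    betas = [
        beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        for i in range(num_steps)
    ]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def decode_action(predict_noise, action_dim, num_steps=50, seed=0):
    """Start from Gaussian noise and iteratively denoise it into an action."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    action = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    for t in reversed(range(num_steps)):
        eps = predict_noise(action, t, alpha_bars[t])
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        # No fresh noise is added on the final step (t == 0).
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        action = [
            scale * (a - coef * e) + sigma * rng.gauss(0.0, 1.0)
            for a, e in zip(action, eps)
        ]
    return action


# Hypothetical target action (e.g. 6-DoF end-effector delta + gripper command).
TARGET = [0.1, -0.2, 0.3, 0.0, 0.0, 0.0, 1.0]


def oracle_noise(action, t, alpha_bar):
    """Exact noise for a data distribution concentrated on TARGET.

    A real policy would replace this with a small learned network.
    """
    return [
        (a - math.sqrt(alpha_bar) * g) / math.sqrt(1.0 - alpha_bar)
        for a, g in zip(action, TARGET)
    ]


decoded = decode_action(oracle_noise, action_dim=len(TARGET))
```

With a perfect noise predictor the loop recovers `TARGET` up to floating-point error. With a learned predictor, the same loop can represent multimodal action distributions, which is the expressiveness argument for using a diffusion head.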

Karl Pertsch, 2 years ago

We’re fully open-sourcing model checkpoints, our pre-training and finetuning pipelines! Initially, Octo comes in two sizes: Octo-Small (27M params) and Octo-Base (93M params). All models are on HuggingFace, so loading an Octo model is as easy as this: 5/
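
The snippet attached to this post did not survive in the text, so what follows is a hedged reconstruction rather than the original. It assumes the `octo` package from the open-source release is installed, that `OctoModel.load_pretrained` accepts an `hf://` path, and that the checkpoints live under the `rail-berkeley` organization on HuggingFace. `checkpoint_uri` and `load_octo` are hypothetical helpers added here for illustration.

```python
def checkpoint_uri(size: str) -> str:
    """Build the assumed HuggingFace path for a released Octo checkpoint."""
    if size not in ("small", "base"):
        raise ValueError(f"unknown Octo size: {size!r}")
    # Assumed naming scheme: hf://rail-berkeley/octo-small or octo-base.
    return f"hf://rail-berkeley/octo-{size}"


def load_octo(size: str = "base"):
    """Load a pretrained Octo model (requires the `octo` package and network access)."""
    # Imported lazily so this file can be read and run without octo installed.
    from octo.model.octo_model import OctoModel

    return OctoModel.load_pretrained(checkpoint_uri(size))


# Usage, assuming octo is installed:
# model = load_octo("base")
# print(model.get_pretty_spec())  # summarizes expected observations, tasks, and actions
```

Check the repository README for the current checkpoint names before relying on these paths, since released checkpoints may have been renamed or versioned since the announcement.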

Karl Pertsch, 2 years ago

We’re releasing a tech report with lots of details on what worked and, importantly, what didn’t -- go check it out! 📜: 6/

Karl Pertsch, 2 years ago

Last but not least: Octo is your one-stop-shop for training on OpenX data! We’re releasing high-quality data loaders that work with PyTorch and JAX + a curated dataset split! 7/

Karl Pertsch, 2 years ago

Octo is only the first step towards building generalist robot policies and we’re planning to improve the models over time — larger sizes, more robot morphologies, RL etc etc — really excited to see how folks will use Octo! :) 8/

Karl Pertsch, 2 years ago

This was a big team effort w/ collaborators from UC Berkeley, Stanford & CMU! I'm very grateful to all of them!! :) @its_dibya @HomerWalke @kvablack @oier_mees @SudeepDasari @JoeyHejna Tobias Kreiman, Charles Xu @jianlanluo You Liang Tan @DorsaSadigh @chelseabfinn @svlevine

Karl Pertsch, 2 years ago

Adding the Twitter threads from all my amazing co-leads on the project! Truly inspiring to have so many people work so hard on a common goal! <3

Karl Pertsch, 2 years ago

led base model development & training, and implemented many of the features that make the Octo code easy to use!

Karl Pertsch, 2 years ago

led model evaluation, designed our internal eval bench for iterating on the model & ran many of the evals in the tech report.

Karl Pertsch, 2 years ago

led data & training infrastructure -- that sweet Octo OpenX data loader is in large part Kevin's baby -- loading 25 video datasets concurrently at high speed is no easy feat! Kevin also made large contributions to making Octo easier to use!

Karl Pertsch, 2 years ago

ran many model ablations & evals for the tech report, integrated pre-trained language encoders & last but not least, kept the spirits high during long nights "in the arena" ♥️
