Animesh Garg

@animesh_garg • 35,205 subscribers

Foundation Models for Generalizable Autonomy in Robotics. Reinforcement Learning. Faculty in AI Robotics @GeorgiaTech. Prev @nvidia @UCberkeley @stanford

Shorts

Model-Free Reinforcement Learning (MFRL) has been alluring, especially with supercharged compute with physics on GPU. However, these methods use 0-th order gradient estimates and are often not the best optimizers. Can we do better than PPO in continuous control for robotics? Turns out yes! 🥳

tl;dr: Faster, better RL than PPO in continuous control 💪

The answer lies in using more information from the simulation. We are juicing the simulation on GPU as it is, so why not use it for gradients as well? This has been a driving question in a series of our works.

We first studied this problem in our ICLR 2022 paper on Short Horizon Actor Critic (SHAC). Naive gradient-based methods get stuck in local minima and suffer from exploding/vanishing gradients. SHAC solved this problem with truncated rollouts and model-based value estimation, where the model is a differentiable simulator. This improved sample efficiency and wall-clock time immensely, especially in high-dimensional systems such as humanoids.

Yet, given enough compute, PPO often caught up. Our follow-up paper on Adaptive Horizon Actor Critic (AHAC) at ICML 2024 discovers the cause and provides a fix. We find that even when given ground-truth dynamics, not all gradients are useful due to sample error. First-order model-based reinforcement learning methods that employ differentiable simulation provide gradients with reduced variance, but they are susceptible to bias in scenarios involving stiff dynamics, such as physical contact. We find that back-propagating through contact and long trajectories drastically reduces gradient accuracy. Using this insight, we propose AHAC, which dynamically adapts its rollout horizon to avoid differentiating through stiff contact. AHAC is a first-order model-based RL algorithm that learns high-dimensional tasks in minutes (wall clock) and outperforms PPO by 40%, even in the limit of data provided to PPO.

This work is led by Ignat Georgiev alongside Krishnan Srinivasan, Jie Xu, and Eric Heiden, with ample assistance from the Warp team at NVIDIA Robotics (Miles Macklin).

52,300 views
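To make the difference between 0-th order and first-order gradients concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration: the 1-D dynamics, the quadratic reward, the hand-written `terminal_value` standing in for a learned critic, and the fixed horizon. The zeroth-order estimator is a plain random-perturbation estimator, not PPO, and the first-order gradient is derived by hand rather than taken from a differentiable simulator. It is not the SHAC or AHAC implementation; it only shows the shape of a short-horizon objective (summed rewards plus a terminal value) and the two ways of estimating its gradient.

```python
import random

DT = 0.1       # toy integration step (assumed)
TARGET = 1.0   # toy goal position (assumed)


def terminal_value(x):
    # Hypothetical critic standing in for a learned value function.
    return -5.0 * (x - TARGET) ** 2


def rollout_return(theta, x0, horizon):
    """Short-horizon return: summed rewards plus a terminal value estimate."""
    x = x0
    total = 0.0
    for _ in range(horizon):
        x = x + DT * theta             # toy differentiable dynamics
        total += -(x - TARGET) ** 2    # quadratic reward
    return total + terminal_value(x)


def first_order_grad(theta, x0, horizon):
    """Exact gradient obtained by propagating dx/dtheta through the rollout."""
    x, dx = x0, 0.0
    grad = 0.0
    for _ in range(horizon):
        x = x + DT * theta
        dx = dx + DT                        # sensitivity of the state to theta
        grad += -2.0 * (x - TARGET) * dx    # chain rule through the reward
    return grad + -10.0 * (x - TARGET) * dx  # chain rule through the critic


def zeroth_order_grad(theta, x0, horizon, samples=2000, sigma=0.1, seed=0):
    """Antithetic random-perturbation estimate that uses only returns."""
    rng = random.Random(seed)
    estimate = 0.0
    for _ in range(samples):
        eps = rng.gauss(0.0, 1.0)
        plus = rollout_return(theta + sigma * eps, x0, horizon)
        minus = rollout_return(theta - sigma * eps, x0, horizon)
        estimate += (plus - minus) / (2.0 * sigma) * eps
    return estimate / samples


fo = first_order_grad(0.0, 0.0, horizon=8)
zo = zeroth_order_grad(0.0, 0.0, horizon=8)
print("first-order:", fo, "zeroth-order:", zo)
```

The first-order gradient costs one rollout, while the perturbation estimate needs thousands of rollouts and is still noisy. The horizon here is a fixed constant; per the post, AHAC instead adapts the rollout horizon to avoid differentiating through stiff contact.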

How can robots reliably place objects in diverse real-world tasks? 🤖🔍 Placement is tough—objects vary in shape and placement modes (such as stacking, hanging, and insertion), making it a challenging problem. We introduce AnyPlace, a two-stage method trained purely on synthetic data to predict diverse placement poses of unseen objects for real-world tasks. Read on for more👇

25,378 views

Videos

Robotics is still data starved. Collecting high-quality robot demonstrations remains brutally slow and expensive. Introducing COBALT: a cloud-native teleoperation platform designed for large-scale robot learning. We are democratizing data collection by leveraging the hardware everyone already owns: the smartphone. All you need is to download an app (today)! Read on for more!

103,131 views • 1 month ago

Great result from Genesis AI, but a little overzealous marketing. Anna Heim might want to issue an update.

10,936 views • 2 months ago

Ever wondered if we can model motion as a language? Can we tokenize this new language? Is it useful? Turns out, tremendously! 🚀 In our latest #NeurIPS2024 paper, QueST: Self-Supervised Skill Abstractions for Learning Continuous Control, we find that action tokenization matters a lot! We can learn skill encodings by representing temporal action abstractions with a discrete codebook. This enables two things:
1. Better behaviour cloning: we can better assimilate multi-task data (>9% over the best previous paper). This is currently the best-in-class BC method!
2. Generalization of this language to represent new tasks, with 5-shot transfer to longer-horizon tasks!
Check out the thread by Atharva Mete for more details. And check out more details at:
Joint work with Atharva Mete, Albert Wilcox, Haotian Xue, Yongxin Chen, Georgia Tech School of Interactive Computing, Machine Learning at Georgia Tech, Robotics@GT, NVIDIA Robotics

26,218 views • 1 year ago
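As a rough illustration of what "representing temporal action abstractions with a discrete codebook" can look like, the sketch below tokenizes a 1-D action sequence by splitting it into fixed-length chunks and mapping each chunk to its nearest codebook entry. This is generic nearest-neighbour vector quantization, not the QueST architecture: the codebook here is hand-written rather than learned, and the names `tokenize`, `detokenize`, and `CODEBOOK` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize(actions, codebook, chunk_len):
    """Map each non-overlapping action chunk to its nearest codebook index."""
    tokens = []
    for start in range(0, len(actions) - chunk_len + 1, chunk_len):
        chunk = actions[start:start + chunk_len]
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(chunk, codebook[i]),
        )
        tokens.append(nearest)
    return tokens


def detokenize(tokens, codebook):
    """Decode tokens back into a (lossy) action sequence."""
    return [a for t in tokens for a in codebook[t]]


# Hand-written codebook of 3-step action chunks (assumed, not learned).
CODEBOOK = [
    [0.0, 0.0, 0.0],      # hold still
    [0.5, 0.5, 0.5],      # steady positive command
    [-0.5, -0.5, -0.5],   # steady negative command
    [0.0, 0.5, 1.0],      # ramp up
]

actions = [0.1, -0.05, 0.0, 0.45, 0.55, 0.5, 0.1, 0.4, 0.9]
tokens = tokenize(actions, CODEBOOK, chunk_len=3)
print(tokens)                        # discrete "words" of the motion language
print(detokenize(tokens, CODEBOOK))  # reconstructed actions
```

Once actions are discrete tokens, a sequence model can predict them the way a language model predicts words, which is the intuition behind treating motion as a language.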

Need a helping hand in the lab? Tired of manually doing tedious chemistry experiments over and over? Meet ORGANA 🤖 🧪 – a modular and user-friendly robotic lab assistant that can interact, plan and automate a number of chemistry experiments!

15,084 views • 2 years ago