PPO has long dominated robot locomotion training in simulation. SAC, despite its sample efficiency, couldn't keep up. We analyze why: 🔗

🔥 Integrated into RSL-RL, our approach requires only minimal changes, making SAC a drop-in alternative out of the box.

Robotic Systems Lab

29,529 subscribers

41,757 views • 11 days ago • via X (Twitter)

Similar videos

(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.

Jianren Wang

145,520 views • 1 year ago

We organized an RL competition during the first Openmind Research Institute Winter School in Malaysia. The participants were able to implement SARSA and SAC in just 2 days onboard our Embodied MuJoCo Ant! 🎉

sorina

17,860 views • 4 months ago

Model-Free Reinforcement Learning (MFRL) has been alluring, especially with supercharged compute with physics on GPU. However, these methods use 0-th order gradients and are often not the best optimizers. Can we do better than PPO in continuous control for robotics? Turns out yes! 🥳

tl;dr: Faster, better RL than PPO in continuous control 💪

The answer lies in using more information from the simulation. We are juicing the simulation on GPU as it is, so why not use it for gradients as well? This has been a driving question in a series of our works.

We first studied this problem in our ICLR 2022 paper on Short Horizon Actor Critic (SHAC). Naive gradient-based methods get stuck in local minima and suffer from exploding/vanishing gradients. SHAC solved this problem with truncated rollouts and model-based value estimation, where the model is a differentiable simulator. This improved sample efficiency and wall-clock time immensely, especially in high-dimensional systems such as humanoids.

Yet, given enough compute, PPO often caught up. Our follow-up paper on Adaptive Horizon Actor Critic (AHAC) at ICML 2024 discovers the cause and provides a fix. We find that even when given ground-truth dynamics, not all gradients are useful due to sample error. First-order model-based reinforcement learning methods employing differentiable simulation provide gradients with reduced variance but are susceptible to bias in scenarios involving stiff dynamics, such as physical contact. We find that back-propagating through contact and long trajectories drastically reduces gradient accuracy. Using this insight, we propose AHAC, which dynamically adapts its rollout horizon to avoid differentiating through stiff contact. AHAC is a first-order model-based RL algorithm that learns high-dimensional tasks in minutes (wall clock) and outperforms PPO by 40%, even in the limit of data provided to PPO.

This work is led by Ignat Georgiev alongside Krishnan Srinivasan, Jie Xu, and Eric Heiden, with ample assistance from the Warp team at NVIDIA Robotics (Miles Macklin).

Animesh Garg

52,279 views • 2 years ago

Trained in Simulation. Our robot learns to walk naturally, similar to a human, via a high-fidelity physics simulator. We simulate years of data in only a few hours.

Figure

40,444 views • 1 year ago

I've gotten a MuJoCo sim RL training loop for a Unitree robot running at 200k SPS. I'm looking into the physics for friction and contact dynamics. My goal: can I reproduce and beat the MuJoCo Playground RL baselines? This is running in my web browser with raylib. It's the baseline.

kache

29,377 views • 11 days ago

A long time in the making, Into The Unknown, our 3rd studio album, is out now!

Nero

49,102 views • 1 year ago

cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac cul de sac

idyllicgurl⋆. 𐙚 ˚

16,375 views • 5 months ago

The male Trachycephalus coriaceus inflates its large vocal sac to produce powerful mating calls, helping it stand out in dense rainforest soundscapes.

Massimo

25,385 views • 11 months ago

These three bunts (safety, non-directional sac, and directional sac bunt) all occurred in the 7th inning of yesterday’s game between the Reds & Padres.

Jerry Weinstein

40,243 views • 11 days ago

Austin Wells drives in his fourth run of the night with a sac fly

Talkin' Yanks

49,771 views • 1 year ago

EXCLUSIVE🔥New Sac State coach Coach Marion: “The vision of Dr. Luke Wood and Mark Orr has my family and I very excited to build this program into a national power. GoGo Hornets!” Read more online at Sac State Football Sacramento State

Sacramento OBSERVER

37,965 views • 1 year ago

A couple of knocks and a sac fly for BJ Gibson today. Been impressive to watch him compete in the box after nearly 18 months away from baseball. Only 4 Ks in 17 preseason PAs. Has only swung-and-missed on 4 pitches. Patrolled CF well the last 2 days. His SF and one 1B:

Brett Nevitt

22,076 views • 1 year ago

DDG had the whole crowd turnt up at SAC State University 🔥

clip 🛸

69,647 views • 9 months ago

Sac TN only🐆

priscillaxxx

14,112 views • 1 month ago

Volpe knocks one in on a sac fly!

Talkin' Yanks

58,566 views • 9 months ago

On-site in SAC 📍

OKC THUNDER

18,392 views • 1 year ago

🏆2024 SAC CHAMPIONS🏆 After losing Game 1 of the SAC Tournament, the Bears go on to WIN 6 STRAIGHT to take the SAC Championship back to Hickory‼️

Lenoir-Rhyne Bears

11,877 views • 2 years ago

Over 6,200 in attendance for today’s Sac State Spring football game. That’s the largest crowd EVER for the annual Sac State Spring football game.

Kevin John

23,403 views • 1 year ago

𝗛𝗔𝗥𝗧𝗦𝗘𝗟𝗟𝗘 𝗧𝗜𝗘𝗦 𝗜𝗧 ‼️ Bevill Baseball recruit & ‘24 OF Brody Leathers (brody leathers; Hartselle Baseball) delivers with a sac fly into left field to tie the game up at 6-6. Hartselle has a runner on second and two-outs in the frame. #ALHS24

Prep Baseball Alabama

15,223 views • 2 years ago