#reinforcementlearning

Research on bringing Olaf, the popular character from the film "Frozen", to life in the real world as a robot that moves just as he does on screen. #animation #character #robot #mechanical #design #ReinforcementLearning #Olaf #DisneyResearchHub

415,849 views • 7 months ago

@DisneyResearch introduces their new robot at #IROS2023! Trained in simulation with #reinforcementlearning! ieeeIROS

Davide Scaramuzza

1,313,677 views • 2 years ago

Going shopping with a robot dog. It even carries the heavy bags. #quadrupedal #quadruped #robotdog #robotics #reinforcementlearning #ArtificialInteligence #DeepRobotics

418,120 views • 2 years ago

In-hand cylinder rotation with RL. IROS 2025. Different sizes, zero-shot sim-to-real, and full dynamic tactile feedback, all in one hand. #Sharpa #SharpaWave #IROS2025 #Robotics #DexterousHand #Humanoid #TactileSensing #ReinforcementLearning #IssacSim

112,434 views • 9 months ago

A bipedal robot that learns to "fall gracefully while keeping damage to a minimum". This may be useful for entertainment, such as stage performances. #bipedal #humanoidrobot #ReinforcementLearning #DisneyResearchHub #entertainment

89,611 views • 8 months ago

A modular robot that can be freely configured as dual-arm, bipedal, wheeled, and so on, to suit the application. #bipedal #Wheeled_Legged #DualArm #robot #EmbodiedIntelligence #ReinforcementLearning #LimxTron2 #tron2 #LimxDynamics

76,205 views • 7 months ago

Transformer robot is now opening doors for you! ETH Zürich @swiss_mile D-MAVT, ETH Zurich Full video: Authors: Clemens Schwarke, Victor Klemm, Matthijs van der Boon, Marko Bjelonic, and Marco Hutter #robotics #ai #reinforcementlearning #ethzurich

Robotic Systems Lab

105,504 views • 2 years ago

Check out our latest work, "Actor-Critic Model Predictive Control: Differentiable Optimization meets Reinforcement Learning for Agile Flight," published in the IEEE Transactions on Robotics, where we reconcile #OptimalControl and #ReinforcementLearning, achieving the same super-human performance, but with superior generalizability, as our previous model-free deep RL! Code released! PDF: Code: Full Video:

Model-free #ReinforcementLearning (RL) is known for its strong task performance and flexibility in optimizing general reward formulations. On the other hand, #ModelPredictiveControl (MPC) provides robustness, constraint handling, and powerful online replanning capabilities. In this work, we extend our previous AC-MPC paper (Romero, ICRA'24) by taking a deeper look at how both approaches can be unified. We introduce and extend Actor-Critic Model Predictive Control (AC-MPC), a framework that embeds a differentiable MPC inside an Actor-Critic RL architecture. This integration allows the MPC-based actor to perform short-term predictive optimization, while the critic facilitates long-horizon learning and exploration.

We conduct a comprehensive study that highlights AC-MPC’s key advantages:
- Better out-of-distribution generalization, both against unknown disturbances and changes in the quadrotor dynamics
- Improved sample efficiency
- A novel empirical analysis uncovering a relationship between the critic’s value function and the MPC cost function, providing deeper insight into their interplay.

We validate our method in simulation and the real world on a quadcopter flying at superhuman speeds of up to 21 m/s, matching state-of-the-art model-free RL performance, and retaining the predictive structure of MPC for more reliable out-of-distribution behavior.

Reference: Actor-Critic Model Predictive Control: Differentiable Optimization meets Reinforcement Learning for Agile Flight. IEEE Transactions on Robotics (T-RO), 2025. PDF: Full Video: Code:

Kudos to Ángel Romero, Elie Aljalbout, Yunlong Song! University of Zurich UZH Science UZH Space Hub AUTOASSESS European Research Council (ERC) UZHai

Davide Scaramuzza

27,090 views • 6 months ago

Atlas demonstrates policies developed with reinforcement learning, using human motion capture and animation as references. #bipedal #humanoid #robot #Atlas #BostonDynamics #ReinforcementLearning #RLPolicy

54,769 views • 1 year ago

We are excited to share our #ICRA2026 paper "Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight"! Paper: Video:

Can we use Model-Based #ReinforcementLearning (MBRL) to fly a drone from pixels to commands? In this work, we train quadrotor navigation policies from scratch using #WorldModels, mapping raw onboard camera pixels directly to control commands, much like a human pilot! While model-free methods like PPO are sample-inefficient and struggle in this setting, we leverage #MBRL to train visuomotor policies capable of agile flight through a racetrack using only raw pixel observations, no explicit state estimation needed.

A key finding: because our policies are trained end-to-end directly from pixels, we no longer need the perception-aware reward term used in previous methods. Instead, this behavior emerges naturally! The policies learn to guide the camera toward feature-rich areas of the observation space on their own.

Kudos to Ángel Romero, Ashwin Shenai, Ismail Geles, Elie Aljalbout

Reference: "Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight". Angel Romero*, Ashwin Shenai*, Ismail Geles, Elie Aljalbout, Davide Scaramuzza. IEEE International Conference on Robotics and Automation (ICRA), Vienna, 2026.

European Research Council (ERC) AUTOASSESS UZH IfI University of Zurich UZH Science Prophesee SynSense UZH Space Hub Swiss Robotics NCCR Robotics

Davide Scaramuzza

15,965 views • 3 months ago

“There are no authorities in science,” says Turing Award winner Richard Sutton, Amii Fellow & Canada CIFAR AI Chair. Sit down with Rich and Cam Linke as they discuss the journey to this moment. Watch now: #TuringAward #AI #ReinforcementLearning

37,583 views • 1 year ago