#reinforcementlearning

We are excited to share our #ICRA2026 paper "Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight"!
Paper: Video:
Can we use Model-Based #ReinforcementLearning (MBRL) to fly a drone from pixels to commands? In this work, we train quadrotor navigation policies from scratch using #WorldModels, mapping raw onboard camera pixels directly to control commands, much like a human pilot!
While model-free methods like PPO are sample-inefficient and struggle in this setting, we leverage #MBRL to train visuomotor policies capable of agile flight through a racetrack using only raw pixel observations, with no explicit state estimation needed.
A key finding: because our policies are trained end-to-end directly from pixels, we no longer need the perception-aware reward term used in previous methods. Instead, this behavior emerges naturally! The policies learn on their own to guide the camera toward feature-rich areas of the observation space.
Kudos to Ángel Romero, Ashwin Shenai, Ismail Geles, Elie Aljalbout
Reference: "Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight", Angel Romero*, Ashwin Shenai*, Ismail Geles, Elie Aljalbout, Davide Scaramuzza. IEEE International Conference on Robotics and Automation (ICRA), Vienna, 2026.
European Research Council (ERC), AUTOASSESS, UZH IfI, University of Zurich, UZH Science, Prophesee, SynSense, UZH Space Hub, Swiss Robotics, NCCR Robotics
Davide Scaramuzza • 15,487 views • 1 month ago

Check out our latest work, "Actor-Critic Model Predictive Control: Differentiable Optimization meets Reinforcement Learning for Agile Flight," published in the IEEE Transactions on Robotics, where we reconcile #OptimalControl and #ReinforcementLearning, achieving the same superhuman performance as our previous model-free deep RL, but with superior generalizability! Code released!
PDF: Code: Full Video:
Model-free #ReinforcementLearning (RL) is known for its strong task performance and flexibility in optimizing general reward formulations. On the other hand, #ModelPredictiveControl (MPC) provides robustness, constraint handling, and powerful online replanning capabilities.
In this work, we extend our previous AC-MPC paper (Romero, ICRA'24) by taking a deeper look at how both approaches can be unified. Actor-Critic Model Predictive Control (AC-MPC) is a framework that embeds a differentiable MPC inside an Actor-Critic RL architecture. This integration allows the MPC-based actor to perform short-term predictive optimization, while the critic facilitates long-horizon learning and exploration.
We conduct a comprehensive study that highlights AC-MPC's key advantages:
- Better out-of-distribution generalization, both against unknown disturbances and changes in the quadrotor dynamics
- Improved sample efficiency
- A novel empirical analysis uncovering a relationship between the critic's value function and the MPC cost function, providing deeper insight into their interplay.
We validate our method in simulation and the real world on a quadcopter flying at superhuman speeds of up to 21 m/s, matching state-of-the-art model-free RL performance while retaining the predictive structure of MPC for more reliable out-of-distribution behavior.
Reference: "Actor-Critic Model Predictive Control: Differentiable Optimization meets Reinforcement Learning for Agile Flight", IEEE Transactions on Robotics (T-RO), 2025.
PDF: Full Video: Code:
Kudos to Ángel Romero, Elie Aljalbout, Yunlong Song!
University of Zurich, UZH Science, UZH Space Hub, AUTOASSESS, European Research Council (ERC), UZHai
Davide Scaramuzza • 26,960 views • 4 months ago

Bipedal robot conquers the wild with reinforcement learning #bipedal #robot #locomotion #ReinforcementLearning #LimXDynamics
T.Yamazaki • 38,581 views • 1 year ago


