Sergey Levine

@svlevine • 131,426 subscribers

Associate Professor at UC Berkeley; Co-founder, Physical Intelligence

Tmrw (Sat) I'll talk about pi-zero & HIL-SERL at the CoRL workshop on Manipulation in a World of Abundant data at 9 am CET (Jupiter room)! Also leading a discussion about cross-embodiment in the X-Embodiment workshop at 9:30 am (Terra). Come find out about cool recent advances!

Flow reversal steering allows "steering" diffusion-based VLAs with high-level actions, for example from VLM reasoning. This also lets us run RL in the diffusion noise space with exploration guided by high-level reasoning: think through a task, then practice it! 👇

If you want a robot to do something well, you need to know how to talk to it. If you don't, you can learn, with Semantic Action RL! In our paper, Jagdeep Bhatia (@ RSS 2026), Andrew Wagenmaker, and Will Chen show how RL over VLA prompts enables new tasks and learns blazing fast in the real world!

We can learn a model that provides shaped "process rewards" for robotic RL and that evolves automatically as the policy gets better. This improves performance on benchmarks, and works in the real world! Some fun new work with Raymond Tsao & Andrew Wagenmaker

We finished evaluating π0.7, our new model at Physical Intelligence. What I'm most excited about with π0.7 is that it's starting to show some surprising emergent compositional generalization, being able to both perform complex tasks and learn new tasks just from instructions.

If you have a policy that uses diffusion/flow (e.g. diffusion VLA), you can run RL where the actor chooses the noise, which is then denoised by the policy to produce an action. This method, which we call diffusion steering (DSRL), leads to a remarkably efficient RL method! 🧵👇

A while back Benjie Holson described a set of "Robot Olympics" challenge tasks -- washing a pan, making a peanut butter sandwich, and more. We tried to fine-tune our models at PI to these tasks, and found that we could do most of them. A few highlights below.

One of my favorite results for π*0.6 is this video: 13 hours of making lattes, Americanos, and espresso for folks at our office in San Francisco.

Really excited to share what I've been working on with my colleagues at Physical Intelligence! We've developed a prototype robotic foundation model that can fold laundry, assemble a box, bus a table, and many other things. We've written a paper and blog post about it. 🧵👇

Turns out that vision-language models can control robots too. The secret is to just finetune them to print out the actions (literally, as text). Really excited about our new result, the successor to RT-1. RT-2 is a pre-trained VLM: Short 🧵👇

Real-time inference is a big challenge for VLAs. We’ve been working on a way to amortize inference delays in π0.5. Our new Real-Time Chunking (RTC) method speeds up π0.5 by allowing the robot to “think” while it’s moving, which makes it quite a bit faster! 🧵👇

At Physical Intelligence, we teamed up with Weave Robotics and Ultra to stress-test our models in real-world deployments. This was a really fun collaboration that saw our latest pi06 model running in production at Sea Breeze Cleaners and a real warehouse! More below.

Diffusion models make great images. But can they drive robots? Usually that gets complicated really fast. We figured out how to get a Stable Diffusion model (based on Instruct pix2pix) to drive robotic instruction following. Simple recipe, works on a wide range of tasks. Thread👇

Very happy to announce that we are open-sourcing the π₀ model, weights, and some fine-tuned checkpoints! Hoping this leads to lots of great follow-up research: Here is a fun test from our friends at UPenn.

Watch this robot dog learn to walk from scratch in real time! Our new method, APRL, dynamically adjusts exploration constraints to enable fast and performant RL directly in the real world. APRL can also adapt to changes in the terrain. No simulation, no demos. A thread 👇

Language following is a tough problem for VLAs: while these models can follow complex language, in practice getting datasets that enable language following is hard. We developed a method to counterfactually and automatically label data to improve language following! 🧵👇

Benjie posted a reply to the Physical Intelligence "Olympics" attempt: Some nice discussion about what makes a task hard, how current learning methods should change how people think about robotic capability, and some interesting commentary.

We trained a robotic foundation model that can drive mobile robots in six different countries, and navigate Sproul Plaza at midday on the UC Berkeley campus! Some cool new work w/ noriaki_hirose, Lydia Ignatova, Kyle Stachowicz, Catherine Glossop, Dhruv Shah

Self-supervised representation learning looks a bit like RL. What if we literally use RL as a SSL method for visual representations? Turns out that it works quite well. In new work by Dibya Ghosh, we show how this can be done:

If we train VLAs to respond to diverse multimodal prompts, then we can steer them better: [grasp the carrot]/[move to x,y,z]/[put the carrot on the plate]. With many levels of detail, powerful VLMs can step in and steer the model to success much more often! More below 👇

Scaling laws in deep RL? Turns out that the batch size, learning rate, and UTD (update-to-data) ratio that give the most efficient and scalable deep RL have predictable relationships. Check out the analysis in new work by Oleg Rybkin & collaborators:
