Sergey Levine

@svlevine • 131,426 subscribers

Associate Professor at UC Berkeley; Co-founder, Physical Intelligence

Shorts

Tmrw (Sat) I'll talk about pi-zero & HIL-SERL at the CoRL workshop on Manipulation in a World of Abundant Data at 9 am CET (Jupiter room)! Also leading a discussion about cross-embodiment in the X-Embodiment workshop at 9:30 am (Terra). Come find out about cool recent advances!

40,525 views

Videos

Flow reversal steering allows "steering" diffusion-based VLAs with high-level actions, for example from VLM reasoning. This also lets us run RL in the diffusion noise space with exploration guided by high-level reasoning: think through a task, then practice it! 👇

74,382 views • 1 month ago

If you want a robot to do something well, you need to know how to talk to it. If you don't, you can learn, with Semantic Action RL! In our paper, Jagdeep Bhatia @ RSS 2026, Andrew Wagenmaker, Will Chen show how RL over VLA prompts enables new tasks and learns blazing fast in the real world!

29,139 views • 16 days ago

We can learn a model that provides shaped "process rewards" for robotic RL, that evolves automatically as the policy gets better. This improves performance on benchmarks, and works in the real world! Some fun new work with Raymond Tsao & Andrew Wagenmaker

36,440 views • 23 days ago

We finished evaluating π0.7, our new model at Physical Intelligence. What I'm most excited about with π0.7 is that it's starting to show some surprising emergent compositional generalization, being able to both perform complex tasks and learn new tasks just from instructions.

60,722 views • 3 months ago

If you have a policy that uses diffusion/flow (e.g. diffusion VLA), you can run RL where the actor chooses the noise, which is then denoised by the policy to produce an action. This method, which we call diffusion steering (DSRL), leads to a remarkably efficient RL method! 🧵👇

152,824 views • 1 year ago

A while back Benjie Holson described a set of "Robot Olympics" challenge tasks -- washing a pan, making a peanut butter sandwich, and more. We tried to fine-tune our models at PI to these tasks, and found that we could do most of them. A few highlights below.

81,283 views • 6 months ago

One of my favorite results for π*0.6 is this video: 13 hours of making lattes, Americanos, and espresso for folks at our office in San Francisco.

57,905 views • 8 months ago

Really excited to share what I've been working on with my colleagues at Physical Intelligence! We've developed a prototype robotic foundation model that can fold laundry, assemble a box, bus a table, and many other things. We've written a paper and blog post about it. 🧵👇

114,997 views • 1 year ago

Turns out that vision-language models can control robots too. The secret is to just finetune them to print out the actions (literally, as text). Really excited about our new result, the successor to RT-1. RT-2 is a pre-trained VLM: Short 🧵👇

165,052 views • 3 years ago

Real-time inference is a big challenge for VLAs. We’ve been working on a way to amortize inference delays in π0.5. Our new Real-Time Chunking (RTC) method speeds up π0.5 by allowing the robot to “think” while it’s moving, which makes it quite a bit faster! 🧵👇

73,688 views • 1 year ago

At Physical Intelligence, we teamed up with Weave Robotics and Ultra to stress-test our models in real-world deployments. This was a really fun collaboration that saw our latest pi06 model running in production at Sea Breeze Cleaners and a real warehouse! More below.

31,910 views • 4 months ago

Diffusion models make great images. But can they drive robots? Usually that gets complicated really fast. We figured out how to get a Stable Diffusion model (based on Instruct pix2pix) to drive robotic instruction following. Simple recipe, works on a wide range of tasks. Thread👇

126,523 views • 2 years ago

Very happy to announce that we are open-sourcing the π₀ model, weights, and some fine-tuned checkpoints! Hoping this leads to lots of great follow-up research: Here is a fun test from our friends at UPenn.

72,471 views • 1 year ago

Watch this robot dog learn to walk from scratch in real time! Our new method, APRL, dynamically adjusts exploration constraints to enable fast and performant RL directly in the real world. APRL can also adapt to changes in the terrain. No simulation, no demos. A thread 👇

105,579 views • 2 years ago

Language following is a tough problem for VLAs: while these models can follow complex language, in practice getting datasets that enable language following is hard. We developed a method to counterfactually and automatically label data to improve language following! 🧵👇

44,229 views • 11 months ago

Benjie posted a reply to the Physical Intelligence "Olympics" attempt: Some nice discussion about what makes a task hard, how current learning methods should change how people think about robotic capability, and some interesting commentary.

27,242 views • 5 months ago

We trained a robotic foundation model that can drive mobile robots in six different countries, and navigate Sproul Plaza in midday on the UC Berkeley campus! Some cool new work w/ noriaki_hirose, Lydia Ignatova, Kyle Stachowicz, Catherine Glossop, Dhruv Shah

52,979 views • 1 year ago

Self-supervised representation learning looks a bit like RL. What if we literally use RL as an SSL method for visual representations? Turns out that it works quite well. In new work by Dibya Ghosh, we show how this can be done:

48,747 views • 1 year ago

If we train VLAs to respond to diverse multimodal prompts, then we can steer them better: [grasp the carrot]/[move to x,y,z]/[put the carrot on the plate]. With many levels of detail, powerful VLMs can step in and steer the model to success much more often! More below 👇

21,049 views • 5 months ago

Scaling laws in deep RL? Turns out that the batch size, learning rate, and UTD (update-to-data) ratio that give the most efficient and scalable deep RL have predictable relationships. Check out the analysis in new work by Oleg Rybkin & collaborators:

43,464 views • 1 year ago