Sergey Levine

@svlevine • 131,426 subscribers

Associate Professor at UC Berkeley; Co-founder, Physical Intelligence

Shorts

Tomorrow (Sat) I'll talk about pi-zero & HIL-SERL at the CoRL workshop on Manipulation in a World of Abundant data at 9 am CET (Jupiter room)! Also leading discussion about cross-embodiment in the X-Embodiment workshop at 9:30 am (Terra). Come find out about cool recent advances!

40,525 views

Videos

Flow reversal steering allows "steering" diffusion-based VLAs with high-level actions, for example from VLM reasoning. This also lets us run RL in the diffusion noise space with exploration guided by high-level reasoning: think through a task, then practice it! 👇

74,382 views • 1 month ago

If you want a robot to do something well, you need to know how to talk to it. If you don't, you can learn, with Semantic Action RL! In our paper, Jagdeep Bhatia @ RSS 2026, Andrew Wagenmaker, Will Chen show how RL over VLA prompts enables new tasks and learns blazing fast in the real world!

29,139 views • 16 days ago

We can learn a model that provides shaped "process rewards" for robotic RL, that evolves automatically as the policy gets better. This improves performance on benchmarks, and works in the real world! Some fun new work with Raymond Tsao & Andrew Wagenmaker

36,440 views • 23 days ago

We finished evaluating π0.7, our new model at Physical Intelligence. What I'm most excited about with π0.7 is that it's starting to show some surprising emergent compositional generalization, being able to both perform complex tasks and learn new tasks just from instructions.

60,722 views • 3 months ago

If you have a policy that uses diffusion/flow (e.g. diffusion VLA), you can run RL where the actor chooses the noise, which is then denoised by the policy to produce an action. This method, which we call diffusion steering (DSRL), leads to a remarkably efficient RL method! 🧵👇

152,824 views • 1 year ago
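
A minimal toy sketch of the noise-space idea in the DSRL post above, not the actual DSRL implementation. `frozen_policy` is a hypothetical stand-in for a pretrained diffusion/flow policy, `reward` is an invented 1-D objective, and plain hill climbing stands in for the RL actor update. The only point it illustrates is that the pretrained policy stays fixed while learning happens over the noise that is fed into it.

```python
import random

def frozen_policy(obs, noise):
    """Hypothetical stand-in for a pretrained diffusion/flow policy.

    Deterministically "denoises" the initial noise into an action.
    Its parameters are never updated in this sketch.
    """
    x = noise
    for _ in range(4):              # a few toy denoising steps
        x = x + 0.25 * (obs - x)    # pull the sample toward an obs-dependent mode
    return x

def reward(action, target=1.0):
    """Invented toy reward: higher when the action is close to the target."""
    return -(action - target) ** 2

def train_noise_actor(obs, iters=200, sigma=0.3, seed=0):
    """Search over the input noise (not the policy weights) to raise reward.

    Hill climbing is used here only as a simple substitute for an RL update.
    """
    rng = random.Random(seed)
    mean = 0.0                                   # the "actor": one scalar noise value
    best = reward(frozen_policy(obs, mean))
    for _ in range(iters):
        candidate = mean + rng.gauss(0.0, sigma)
        r = reward(frozen_policy(obs, candidate))
        if r > best:                             # keep the noise only if it improves reward
            mean, best = candidate, r
    return mean, best

if __name__ == "__main__":
    noise, best = train_noise_actor(obs=0.0)
    print("chosen noise:", noise, "reward:", best)
```

In a real setting the actor would be a learned network conditioned on the observation, and the denoiser would be the pretrained VLA; both are assumptions outside what the post states.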

A while back Benjie Holson described a set of "Robot Olympics" challenge tasks -- washing a pan, making a peanut butter sandwich, and more. We tried to fine-tune our models at PI to these tasks, and found that we could do most of them. A few highlights below.

81,283 views • 6 months ago

One of my favorite results for π*0.6 is this video: 13 hours of making lattes, Americanos, and espresso for folks at our office in San Francisco.

57,905 views • 8 months ago

Really excited to share what I've been working on with my colleagues at Physical Intelligence! We've developed a prototype robotic foundation model that can fold laundry, assemble a box, bus a table, and many other things. We've written a paper and blog post about it. 🧵👇

114,997 views • 1 year ago

Turns out that vision-language models can control robots too. The secret is to just finetune them to print out the actions (literally, as text). Really excited about our new result, the successor to RT-1. RT-2 is a pre-trained VLM: Short 🧵👇

165,052 views • 3 years ago
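
A rough sketch of what "printing out the actions as text" in the RT-2 post above can look like: each continuous action dimension is discretized into an integer bin, and the bins are written as a space-separated string that a language model could emit. The function names, the [-1, 1] range, and the 256 uniform bins are assumptions chosen for illustration, not details taken from the post.

```python
def encode_action(action, low=-1.0, high=1.0, bins=256):
    """Turn a list of continuous action values into a text string of bin indices."""
    tokens = []
    for a in action:
        a = min(max(a, low), high)                        # clip to the assumed range
        idx = int(round((a - low) / (high - low) * (bins - 1)))
        tokens.append(str(idx))
    return " ".join(tokens)

def decode_action(text, low=-1.0, high=1.0, bins=256):
    """Turn the text string of bin indices back into approximate action values."""
    return [low + int(t) / (bins - 1) * (high - low) for t in text.split()]

if __name__ == "__main__":
    text = encode_action([-1.0, 1.0])   # "0 255"
    print(text, decode_action(text))
```

The round trip is lossy: decoded values differ from the originals by at most half a bin width, which is the price of representing actions with a fixed text vocabulary.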

Real-time inference is a big challenge for VLAs. We’ve been working on a way to amortize inference delays in π0.5. Our new Real-Time Chunking (RTC) method speeds up π0.5 by allowing the robot to “think” while it’s moving, which makes it quite a bit faster! 🧵👇

73,688 views • 1 year ago

At Physical Intelligence, we teamed up with Weave Robotics and Ultra to stress-test our models in real-world deployments. This was a really fun collaboration that saw our latest pi06 model running in production at Sea Breeze Cleaners and a real warehouse! More below.

31,910 views • 4 months ago

Diffusion models make great images. But can they drive robots? Usually that gets complicated really fast. We figured out how to get a Stable Diffusion model (based on Instruct pix2pix) to drive robotic instruction following. Simple recipe, works on a wide range of tasks. Thread👇

126,523 views • 2 years ago

Very happy to announce that we are open-sourcing the π₀ model, weights, and some fine-tuned checkpoints! Hoping this leads to lots of great follow-up research: Here is a fun test from our friends at UPenn.

72,471 views • 1 year ago

Watch this robot dog learn to walk from scratch in real time! Our new method, APRL, dynamically adjusts exploration constraints to enable fast and performant RL directly in the real world. APRL can also adapt to changes in the terrain. No simulation, no demos. A thread 👇

105,579 views • 2 years ago

Language following is a tough problem for VLAs: while these models can follow complex language, in practice getting datasets that enable language following is hard. We developed a method to counterfactually and automatically label data to improve language following! 🧵👇

44,229 views • 11 months ago

Benjie posted a reply to the Physical Intelligence "Olympics" attempt: Some nice discussion about what makes a task hard, how current learning methods should change how people think about robotic capability, and some interesting commentary.

27,242 views • 5 months ago

We trained a robotic foundation model that can drive mobile robots in six different countries, and navigate Sproul Plaza in midday on the UC Berkeley campus! Some cool new work w/ noriaki_hirose, Lydia Ignatova, Kyle Stachowicz, Catherine Glossop, Dhruv Shah

52,979 views • 1 year ago

Self-supervised representation learning looks a bit like RL. What if we literally use RL as an SSL method for visual representations? Turns out that it works quite well. In new work by Dibya Ghosh, we show how this can be done:

48,747 views • 1 year ago

If we train VLAs to respond to diverse multimodal prompts, then we can steer them better: [grasp the carrot]/[move to x,y,z]/[put the carrot on the plate]. With many levels of detail, powerful VLMs can step in and steer the model to success much more often! More below 👇

21,049 views • 5 months ago

Scaling laws in deep RL? Turns out that the batch size, learning rate, and UTD (update-to-data) ratio that give the most efficient and scalable deep RL have predictable relationships. Check out the analysis in new work by Oleg Rybkin & collaborators:

43,464 views • 1 year ago