
Sergey Levine
@svlevine • 128,158 subscribers
Associate Professor at UC Berkeley • Co-founder, Physical Intelligence

We finished evaluating π0.7, our new model at Physical Intelligence. What I'm most excited about with π0.7 is that it's starting to show some surprising emergent compositional generalization, being able to both perform complex tasks and learn new tasks just from instructions.
Sergey Levine • 59,885 views • 1 month ago

A while back Benjie Holson described a set of "Robot Olympics" challenge tasks -- washing a pan, making a peanut butter sandwich, and more. We tried fine-tuning our models at PI on these tasks, and found that we could do most of them. A few highlights below.
Sergey Levine • 81,204 views • 5 months ago

If you have a policy that uses diffusion/flow (e.g. diffusion VLA), you can run RL where the actor chooses the noise, which is then denoised by the policy to produce an action. This method, which we call diffusion steering (DSRL), leads to a remarkably efficient RL method! 🧵👇
Sergey Levine • 151,888 views • 11 months ago
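A rough sketch of the noise-steering idea from the post above, under heavy assumptions: `frozen_denoiser` is a hypothetical toy stand-in for a pretrained diffusion/flow policy (not a real model), `noise_actor` is an illustrative linear actor, and a simple hill-climbing search stands in for the actual RL algorithm, which the post does not specify. The point is only the data flow: the actor outputs noise, the frozen policy denoises it into an action, and only the actor is updated.

```python
import numpy as np

STATE_DIM, ACTION_DIM = 3, 2

def frozen_denoiser(state, noise, steps=10):
    """Toy stand-in for a frozen flow policy (assumption, not a real model).
    Deterministically integrates a simple ODE from the noise to an action."""
    target = np.tanh(state[:ACTION_DIM])          # toy state conditioning
    x = noise.astype(float).copy()
    for _ in range(steps):
        # Euler step of a toy velocity field that depends on the initial noise
        x = x + (1.0 / steps) * ((target + 0.5 * noise) - x)
    return x

def noise_actor(params, state):
    """Hypothetical linear actor: picks the noise instead of sampling it."""
    W, b = params
    return W @ state + b

def act(params, state):
    # Actor chooses the noise; the frozen policy turns it into an action.
    return frozen_denoiser(state, noise_actor(params, state))

def reward(action, goal):
    # Toy reward: negative squared distance to a goal action.
    return -float(np.sum((action - goal) ** 2))

def hill_climb(state, goal, iters=200, sigma=0.1, seed=0):
    """Stand-in for RL: random perturbations of the actor, keep improvements.
    The denoiser is never modified."""
    rng = np.random.default_rng(seed)
    params = (np.zeros((ACTION_DIM, STATE_DIM)), np.zeros(ACTION_DIM))
    initial = best = reward(act(params, state), goal)
    for _ in range(iters):
        cand = (params[0] + sigma * rng.standard_normal(params[0].shape),
                params[1] + sigma * rng.standard_normal(params[1].shape))
        r = reward(act(cand, state), goal)
        if r > best:
            params, best = cand, r
    return params, initial, best

state = np.array([0.2, -0.1, 0.4])
goal = np.array([0.5, -0.3])
params, initial_reward, final_reward = hill_climb(state, goal)
print(initial_reward, final_reward)
```

Because the search only touches the actor's parameters, the pretrained policy's weights stay fixed in this sketch; the steering happens entirely through the choice of input noise.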

At Physical Intelligence, we teamed up with Weave Robotics and Ultra to stress-test our models in real-world deployments. This was a really fun collaboration that saw our latest pi06 model running in production at Sea Breeze Cleaners and a real warehouse! More below.
Sergey Levine • 31,910 views • 3 months ago

Really excited to share what I've been working on with my colleagues at Physical Intelligence! We've developed a prototype robotic foundation model that can fold laundry, assemble a box, bus a table, and many other things. We've written a paper and blog post about it. 🧵👇
Sergey Levine • 114,931 views • 1 year ago

Real-time inference is a big challenge for VLAs. We’ve been working on a way to amortize inference delays in π0.5. Our new Real-Time Chunking (RTC) method speeds up π0.5 by allowing the robot to “think” while it’s moving, which makes it quite a bit faster! 🧵👇
Sergey Levine • 73,492 views • 11 months ago

Diffusion models make great images. But can they drive robots? Usually that gets complicated really fast. We figured out how to get a Stable Diffusion model (based on Instruct pix2pix) to drive robotic instruction following. Simple recipe, works on a wide range of tasks. Thread👇
Sergey Levine • 126,519 views • 2 years ago

If we train VLAs to respond to diverse multimodal prompts, then we can steer them better: [grasp the carrot]/[move to x,y,z]/[put the carrot on the plate]. With many levels of detail, powerful VLMs can step in and steer the model to success much more often! More below 👇
Sergey Levine • 20,933 views • 3 months ago

Language following is a tough problem for VLAs: while these models can follow complex language, in practice getting datasets that enable language following is hard. We developed a method to counterfactually and automatically label data to improve language following! 🧵👇
Sergey Levine • 44,176 views • 9 months ago

Watch this robot dog learn to walk from scratch in real time! Our new method, APRL, dynamically adjusts exploration constraints to enable fast and performant RL directly in the real world. APRL can also adapt to changes in the terrain. No simulation, no demos. A thread 👇
Sergey Levine • 105,568 views • 2 years ago

RL in the real world presents some big challenges, but also some really big opportunities. In our new work, HIL-SERL, Charles Xu, Jeffrey Wu, and Jianlan Luo show that real-world RL can learn a huge range of precise and robust tasks, and perform them much faster than imitation.
Sergey Levine • 36,489 views • 1 year ago