Our model can now learn from its own experience with RL! Our new π*0.6 model can more than double throughput over a base model trained without RL, and can perform real-world tasks: making espresso drinks, folding diverse laundry, and assembling boxes. More in the thread below.
704,626 views • 7 months ago • via X (Twitter)
