When I started my first project on in-hand manipulation, I thought it would be super cool but also quite challenging to make my robot hands spin pens. After almost 2.5 years of effort in this line of research, we have finally succeeded in making our robot hand "spin pens."
114,593 views • 1 year ago • via X (Twitter)
10 Comments

The first thing we tried was direct sim2real transfer. However, it does not work for this task. Despite extensive experimentation with hardware design, objects, and input modalities, we did not achieve any success. The policy either drops the object easily or fails completely due to distribution shift.

How did we do this? We first train an oracle policy in simulation, which provides high-quality trajectories and action data. We use this data to: 1) train a student policy, and 2) serve as an open-loop controller in the real world to collect successful real trajectories. Finally, we fine-tune the student policy using these real trajectories.
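The data flow described above can be sketched with a toy 1-D system. Everything here is a hypothetical stand-in (the dynamics `sim_step` and `real_step`, the linear `oracle`, the linear student fitted by `fit`, and the 0.2 success threshold); it only illustrates which data feeds which stage, not the actual training setup.

```python
# Toy 1-D illustration of the data flow only; dynamics, policy class, and
# thresholds are hypothetical stand-ins, not the authors' setup.

def sim_step(x, a):
    return x + a            # simulated dynamics

def real_step(x, a):
    return x + 0.7 * a      # "real" dynamics with a deliberate sim-to-real gap

def oracle(x):
    return -0.5 * x         # stand-in for the RL-trained oracle policy

def rollout_sim(x0, horizon=10):
    """Run the oracle in simulation; return its observations and actions."""
    x, obs, acts = x0, [], []
    for _ in range(horizon):
        a = oracle(x)
        obs.append(x)
        acts.append(a)
        x = sim_step(x, a)
    return obs, acts

def replay_open_loop(x0, acts):
    """Replay recorded actions on the real system without any feedback."""
    x, obs = x0, []
    for a in acts:
        obs.append(x)
        x = real_step(x, a)
    return obs, x

def fit(pairs, w=0.0, b=0.0, lr=0.5, steps=1000):
    """Gradient descent on mean squared error for the policy a = w * obs + b."""
    n = len(pairs)
    for _ in range(steps):
        gw = sum(2 * (w * o + b - a) * o for o, a in pairs) / n
        gb = sum(2 * (w * o + b - a) for o, a in pairs) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b

starts = [i / 10 for i in range(-10, 11) if i != 0]
sim_data = [rollout_sim(x0) for x0 in starts]

# 1) Train the student on oracle data from simulation.
sim_pairs = [(o, a) for obs, acts in sim_data for o, a in zip(obs, acts)]
w_sim, b_sim = fit(sim_pairs)

# 2) Replay oracle actions open-loop on the real system; keep only successes.
real_pairs = []
num_success = 0
for x0, (_, acts) in zip(starts, sim_data):
    real_obs, x_final = replay_open_loop(x0, acts)
    if abs(x_final) < 0.2:  # hypothetical success check
        num_success += 1
        real_pairs += list(zip(real_obs, acts))

# 3) Fine-tune the student, starting from its sim-trained weights,
#    on the successful real trajectories.
w_ft, b_ft = fit(real_pairs, w=w_sim, b=b_sim, steps=50)
```

Note that the fine-tuning set pairs real observations with the replayed oracle actions, and that only a subset of the replays passes the success filter.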

Generating a smooth oracle policy is, in itself, another challenge. We design the reward function to encourage horizontal stability and the initial state distribution to promote exploration. Without these techniques, we would not be able to obtain a good policy in simulation.
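As a rough sketch of what these two design choices could look like: the reward below pays for rotation about the vertical axis and penalizes the pen tilting out of the horizontal plane, and the sampler starts each episode from one of several perturbed grasps. The function names, weights, clipping limit, and grasp lists are all assumptions made for illustration, not the reward or initial states actually used.

```python
import math
import random

def spin_reward(pen_axis, spin_rate, w_spin=1.0, w_tilt=2.0, max_rate=0.5):
    """Reward rotation about world z and penalize tilt out of the horizontal plane.

    pen_axis:  3-vector along the pen (any nonzero length), world frame, z up.
    spin_rate: angular velocity about world z in rad/s.
    All weights and the clipping limit are hypothetical.
    """
    norm = math.sqrt(sum(c * c for c in pen_axis))
    tilt = abs(pen_axis[2]) / norm  # 0 when horizontal, 1 when vertical
    # Cap the spin term so faster spinning is not rewarded without bound.
    clipped = max(-max_rate, min(max_rate, spin_rate))
    return w_spin * clipped - w_tilt * tilt

def sample_initial_state(rng, canonical_grasps, noise=0.05):
    """Pick one of several stable grasps and perturb its joint angles."""
    grasp = rng.choice(canonical_grasps)
    return [q + rng.uniform(-noise, noise) for q in grasp]

# Hypothetical joint-angle vectors for two stable grasps.
grasps = [[0.0, 0.2, 0.4], [0.1, 0.3, 0.5]]
rng = random.Random(0)
state = sample_initial_state(rng, grasps)
```

Starting episodes from several different grasps, rather than a single one, is one way to widen the states the policy visits early in training.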

Alternative methods, such as relying solely on real-world replay, pre-training only in simulation, or using sim-to-real distillation with vision, do not work either.

This process made us rethink the role of the simulator and how to properly combine simulation and real-world data. We share our lessons learned as follows:
1. Simulation training requires extensive design for exploration, such as the proper design of initial distributions to aid exploration and using privileged information to facilitate policy learning.
2. Sim-to-Real does not directly work for such contact-rich and highly dynamic tasks. Even when isolating touch and vision, the pure physics sim-to-real gap remains significant and cannot be bridged by extensive domain randomization alone.
3. Simulation is still useful for exploring skills. The dynamic skill of spinning pens with a robotic hand is nearly impossible to achieve with human teleoperation and imitation learning alone. Reinforcement learning in simulation is critical for exploring feasible motion.
4. Only a few real-world trajectories are needed for fine-tuning. Although a proprioceptive policy learned purely in simulation does not work directly in the real world, it can be fine-tuned to adapt to real-world physics using only a few successful trajectories.
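For readers unfamiliar with the term in lesson 2, domain randomization means resampling the simulator's physics for every training episode so the policy sees a spread of dynamics. A minimal sketch follows; the parameter names and ranges are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass

@dataclass
class PhysicsParams:
    friction: float
    pen_mass_kg: float
    motor_gain_scale: float

# Hypothetical ranges, chosen only for illustration.
RANGES = {
    "friction": (0.3, 1.5),
    "pen_mass_kg": (0.01, 0.05),
    "motor_gain_scale": (0.8, 1.2),
}

def sample_physics(rng):
    """Draw one set of physics parameters for a single training episode."""
    return PhysicsParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in RANGES.items()})

rng = random.Random(0)
episodes = [sample_physics(rng) for _ in range(100)]
```

Each sampled `PhysicsParams` would configure the simulator for one episode. As the lesson states, widening these ranges was not enough on its own to close the gap for this task.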

Code is also released! Website:
This project would never have happened without a team with expertise in in-hand manipulation: @Junwang_048, @Ying_yyyyyyyy, Haichuan Che, @YiMaTweets, @JitendraMalikCV, @xiaolonw. Also, thanks to @berkeley_ai for the support!

I really like this 4-fingered hand! Is the design open source or based on any open designs? I’d like to experiment with a 3- or 4-finger manipulator.

wow!

great job!

thank you Anastasios!

