How can robots learn generalizable manipulation skills for diverse objects? Going beyond pick-and-place, our recent work “HACMan” enables complex interactions for unseen objects, such as flipping, pushing, or tilting, using spatial action maps + RL with point clouds. (w/ @MetaAI)

49,846 views • 3 years ago • via X (Twitter)

10 comments

Wenxuan Zhou • 3 years ago

We find that defining the right action space is crucial for learning a manipulation task. We explore an object-centric action representation in RL that consists of selecting a contact location on the object and a set of parameters describing the robot's movement after contact.
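As a rough illustration of this action representation, the sketch below bundles a contact location (chosen from the observed object points) with post-contact motion parameters. The names `ContactAction` and `make_action` are hypothetical, and treating the motion parameters as a 3D displacement is an assumption made only for this example, not something the thread specifies.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class ContactAction:
    """Hypothetical container for one object-centric action."""
    contact_index: int    # index of the chosen point in the observed object point cloud
    contact_point: Point  # 3D location of that point
    motion_params: Point  # post-contact motion (assumed here: a 3D displacement)


def make_action(object_points: List[Point], contact_index: int,
                motion_params: Point) -> ContactAction:
    """Build an action whose contact location is one of the observed points."""
    if not 0 <= contact_index < len(object_points):
        raise IndexError("contact_index must refer to an observed object point")
    return ContactAction(contact_index, object_points[contact_index], motion_params)


# Toy point cloud with three object points (made-up coordinates, in meters).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.05, 0.02)]
action = make_action(points, 2, (0.0, -0.03, 0.0))
```

Because the contact location is an index into the observed points, the action stays tied to the object's geometry rather than to raw robot joint commands.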

Wenxuan Zhou • 3 years ago

Our object-centric action representation has two benefits. It is…
1. Spatially-grounded: because the learned contact location is selected from the observed object points.
2. Temporally-abstracted: because we focus only on learning the contact-rich portions of the action.

Wenxuan Zhou • 3 years ago

With off-policy RL, given a point cloud, the actor outputs per-point motion parameters (Actor Map) while the critic outputs per-point Q-values (Critic Map). The Critic Map is not only used to update the actor but also serves as the scores for selecting the contact location.
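The way the Critic Map can double as a contact-selection score might be sketched as follows. This is a minimal sketch, not the authors' implementation: `select_contact` is a hypothetical helper, the Actor Map and Critic Map are stand-in Python lists rather than network outputs, and the choice between greedy argmax and softmax sampling (controlled by `temperature`) is an assumption.

```python
import math
import random
from typing import Optional, Sequence, Tuple

Params = Tuple[float, float, float]


def select_contact(critic_map: Sequence[float], actor_map: Sequence[Params],
                   temperature: float = 0.0,
                   rng: Optional[random.Random] = None) -> Tuple[int, Params]:
    """Pick a contact point from per-point Q-values and return its motion parameters.

    temperature <= 0 selects the highest-scoring point (greedy);
    temperature > 0 samples from a softmax over the Q-values (assumed exploration scheme).
    """
    if len(critic_map) != len(actor_map):
        raise ValueError("critic_map and actor_map must have one entry per point")
    if not critic_map:
        raise ValueError("point cloud is empty")
    indices = range(len(critic_map))
    if temperature <= 0:
        idx = max(indices, key=lambda i: critic_map[i])
    else:
        q_max = max(critic_map)  # subtract the max for numerical stability
        weights = [math.exp((q - q_max) / temperature) for q in critic_map]
        idx = (rng or random.Random()).choices(indices, weights=weights, k=1)[0]
    return idx, actor_map[idx]


# Toy maps for a three-point cloud (made-up values).
critic_map = [0.1, 0.7, 0.3]
actor_map = [(0.0, 0.0, 0.01), (0.02, 0.0, 0.0), (0.0, -0.01, 0.0)]
idx, params = select_contact(critic_map, actor_map)
```

In the greedy case the point with the largest Q-value is chosen as the contact location, and the motion parameters the actor predicted for that same point are executed after contact.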

Wenxuan Zhou • 3 years ago

We evaluate our method on a 6D object pose alignment task with randomized initial poses, randomized 6D goals, and diverse unseen objects, both in simulation and in the real world.

Wenxuan Zhou • 3 years ago

HACMan outperforms the baselines, with a larger margin for more challenging tasks. Success rates for simple tasks - pushing a single object to an in-plane goal - are high for all methods, but only HACMan achieves high success rates for 6D alignment of diverse objects.

Wenxuan Zhou • 3 years ago

Check out the paper and the website for more information and video results showing HACMan generalizing to different objects and goals! w/@bwww08, Fan Yang, @chris_j_paxton, @davheld

Brett Adcock • 3 years ago

@MetaAI Congrats, thanks for sharing.

Arnav Wadhwa • 3 years ago

@MetaAI Amazing work! I’m wondering about the tradeoff between challenges and improvements when using a human-hand-like end effector with 5 fingers. Curious to know what you think.

Wenxuan Zhou • 3 years ago

@MetaAI Multi-fingered hands may allow a wider variety of motions and have more tolerance (picking an object with a multi-fingered hand can be less sensitive to object shapes than a simple gripper). However, they are more expensive, easier to break, and have a bigger sim2real gap.

Sasha Salter • 2 years ago

@MetaAI Great use of temporal abstraction to simplify learning!
