Estimating a full-body pose from only the headset and controllers is ambiguous. Environment information can help resolve this. In our #SIGGRAPH2023 paper the avatar is generated from only the 3 shown coordinate frames (no cameras) + height of the environment (green dots). (1/4)👇
200,255 views • 3 years ago •via X (Twitter)

The avatar is trained with Reinforcement Learning to imitate hours of typical motions (e.g. getting up from the floor, sitting on chairs) knowing the terrain height. Then during inference it is able to generate the appropriate torques to track similar but unseen motions. (2/4)
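To make the setup above a bit more concrete, here is a minimal sketch of how such an input and training signal could be put together. It is not the paper's implementation: the function names, the 5x5 height grid, the use of tracker positions only (the real coordinate frames also carry orientation), and the exponential tracking reward are all assumptions made for illustration. The exponential form is a shape commonly used in physics-based motion imitation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def sample_heights(height_at: Callable[[float, float], float],
                   center_xy: Tuple[float, float],
                   half_extent: float = 1.0,
                   n: int = 5) -> List[float]:
    """Sample terrain height on an n x n grid centered on the avatar.

    `height_at` is a hypothetical lookup into the environment's height map.
    """
    cx, cy = center_xy
    step = 2.0 * half_extent / (n - 1)
    return [height_at(cx - half_extent + i * step, cy - half_extent + j * step)
            for i in range(n) for j in range(n)]


def build_observation(tracker_positions: Sequence[Vec3],
                      heights: Sequence[float]) -> List[float]:
    """Concatenate headset/controller positions with the height samples."""
    obs = [c for p in tracker_positions for c in p]
    obs.extend(heights)
    return obs


def tracking_reward(simulated: Sequence[Vec3],
                    target: Sequence[Vec3],
                    scale: float = 5.0) -> float:
    """Assumed reward: exp(-scale * mean squared distance); 1.0 is perfect tracking."""
    sq = [sum((a - b) ** 2 for a, b in zip(s, t))
          for s, t in zip(simulated, target)]
    return math.exp(-scale * sum(sq) / len(sq))


# Example: flat floor with a chair-height step for x > 0.5 (made-up terrain).
terrain = lambda x, y: 0.45 if x > 0.5 else 0.0
trackers = [(0.0, 0.0, 1.7), (-0.3, 0.2, 1.1), (0.3, 0.2, 1.1)]  # head, left, right
obs = build_observation(trackers, sample_heights(terrain, (0.0, 0.0)))
```

In a setup like this, the policy would see `obs` at every step and output joint torques, and the reward would compare the simulated avatar's head and hand positions against the recorded ones. The height samples are what would let the same three-point input mean "sitting on a chair" in one place and "crouching on the floor" in another.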

Constraints of the physics simulation achieve these natural lower-body poses (e.g. center-of-mass stability, no object or floor penetration). What's cool is how the avatar also learned to manipulate objects (e.g. tilt the simulated chair) to better follow the input signal. (3/4)

This work was led by Sunmin Lee, who has a more detailed Twitter thread here:
Authors: @sunnyCodes_, @blacksquirrel__, Yuting Ye, Jungdam Won, myself
Project page: (4/4)

The leg estimation left me completely shocked

@VarunMayya @ArpanLokhande @amitkvermaxd @Ridhi_sam_11

@I3Llamas

A fantastic and practical result!

@vr_rames @EricdeBrocart

Great work! I notice that shoulder movement, which should be easy to model, is usually ignored, and the resulting error breaks immersion in most VR user body models. How accurate do you think you could make it if you used the Quest 2 cameras to detect shoulders and knees?

@cgonfire This is it: mocap for VR headset users

