Trying physics-driven animation with two ragdolls. Both the local VR user avatar and the NPC avatar (which represents another user) are physically simulated. This is just a start, still a lot of work to do.

32,460 views • 1 year ago • via X (Twitter)

7 Comments

Spiritmarsrover • 1 year ago

Nice. There is a simple interaction that I've thought about but that is never simulated in VR: pushing a waist-height drawer closed with your waist. I'd want to see if that works.

MayflowerZero • 1 year ago

This looks amazing already, I am so curious to see where this goes! Also, can I just say that you are one of the most creative people I see on my "X" feed? :>

Konshu • 1 year ago

silly but cool

Haru Rose • 1 year ago

What.. what are u doing over there? ... damn that looks fun

Tylyn • 1 year ago

..................woah

Killswig • 1 year ago

My one thought seeing this: if the player viewport gets moved because of the physics, that's instant nausea for me. It's the main reason I can't play Boneworks; if my hands touch something I get lifted up, and bam, I gotta abort VR.

Haï~ • 1 year ago

I am not moving the viewport, that's not the experience I'm going for. The plan is for the local avatar to look different in first-person than in cameras; so the head of your own avatar would move in video captures if you get pushed, while it remains immovable in first-person.
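
The split described in this reply could be sketched roughly as follows. This is only an illustration of the idea, not the author's implementation: the `Pose` type, the `head_pose_for_view` helper, and the view names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pose:
    """Hypothetical head pose: position plus a quaternion rotation (x, y, z, w)."""
    position: Tuple[float, float, float]
    rotation: Tuple[float, float, float, float]


def head_pose_for_view(view: str, tracked_head: Pose, simulated_head: Pose) -> Pose:
    """Pick which head pose a given view renders (assumed helper, not from the post)."""
    if view == "first_person":
        # The user's own viewport follows headset tracking only,
        # so a physics push never moves what the user sees.
        return tracked_head
    # External cameras and video captures show the physically simulated head.
    return simulated_head


# Example: the ragdoll head was pushed, but the headset itself did not move.
tracked = Pose((0.0, 1.7, 0.0), (0.0, 0.0, 0.0, 1.0))
pushed = Pose((0.2, 1.6, -0.1), (0.0, 0.0, 0.0, 1.0))

first_person = head_pose_for_view("first_person", tracked, pushed)
capture = head_pose_for_view("camera", tracked, pushed)
```

Here `first_person` stays at the tracked pose while `capture` uses the pushed pose, which matches the stated goal of an immovable first-person view alongside a physically reacting avatar in captures.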

Related Videos

Want to create an avatar from a single image? FlexAvatar is a transformer model that creates a full 360°, high-quality, expressive 3D head avatar from just a single portrait image in minutes.

Real-time Demo: FlexAvatar's lightweight architecture allows both animation and rendering in real time, enabling interactive user experiences. To create a new 3D head avatar, only one image is required, e.g., from a webcam. The final avatar is ready after 2 minutes.

Architecture: Under the hood, FlexAvatar adopts a transformer-based encoder-decoder design. The encoder maps the input image onto a latent avatar space, while the decoder produces 3D Gaussian attribute maps by incorporating the animation signal via cross-attention. The model learns all facial animations directly from the data without relying on pre-built 3D face models. This equips the avatars with realistic facial expressions. The internal avatar latent space can be conveniently used to integrate additional observations of a person via fitting. This enables use cases where more than one image of a person is available, e.g., from a phone scan of the person.

We train jointly on 2D monocular videos and multi-view data. However, in monocular videos, the animation signal leaks the target viewpoint, causing the model to produce incomplete 3D heads. We call this phenomenon entanglement of driving signal and target viewpoint. To prevent entanglement, we introduce bias sinks. These are learnable tokens that indicate whether a training sample stems from a monocular or a multi-view dataset. During training, the model learns to produce incomplete 3D heads only when the monocular token is present. During inference, FlexAvatar then always uses the multi-view token, for which the model has learned to produce complete 3D heads. This simple design makes it possible to combine the generalizability of monocular data with the quality of multi-view data.
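
The bias-sink selection described above might look something like this minimal sketch. It is not FlexAvatar's code: the token size, the random initialization standing in for learned values, and the `select_bias_sink` helper are assumptions made for illustration, and how the token is fed into the transformer is left out.

```python
import random

MONOCULAR, MULTI_VIEW = "monocular", "multi_view"
TOKEN_DIM = 4  # assumed size, chosen only for the example

# One token per dataset type. In a real model these would be learnable
# parameters; here they are fixed random values as a stand-in.
_rng = random.Random(0)
bias_sink_tokens = {
    MONOCULAR: [_rng.gauss(0.0, 1.0) for _ in range(TOKEN_DIM)],
    MULTI_VIEW: [_rng.gauss(0.0, 1.0) for _ in range(TOKEN_DIM)],
}


def select_bias_sink(source: str, training: bool) -> list:
    """Return the token that accompanies a sample (hypothetical helper)."""
    if not training:
        # At inference the multi-view token is always used, regardless of
        # where the input image came from.
        source = MULTI_VIEW
    return bias_sink_tokens[source]


train_token = select_bias_sink(MONOCULAR, training=True)
infer_token = select_bias_sink(MONOCULAR, training=False)
```

During training the token follows the sample's dataset, so the monocular token can absorb the incomplete-head bias; at inference the lookup is overridden to the multi-view token.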
FlexAvatar summary:
- Input: Single image, phone scan, or monocular video
- Output: Full 360° head avatar
- Expressive animations
- Real-time rendering and animation
- Generalization to any portrait
- Create a new avatar in 2 minutes
- Use bias sinks to combine 2D and 3D data

Great work by Tobias Kirschstein and Simon Giebenhain!

Matthias Niessner

76,454 views • 5 months ago