
[SIGGRAPH '25] EVA: Expressive Virtual Avatars from Multi-view Videos

Contributions:
1. We introduce EVA, a novel method enabling full-body control with real-time, photo-realistic renderings, robustly handling loose clothing dynamics and various facial expressions.
2. We develop an expressive deformable template that generates a deformable human template mesh and employs...

18,407 views • 1 year ago • via X (Twitter)

Comments: 9

MrNeRF • 1 year ago

Paper: Project:

Page to Pixel Publishing • 1 year ago

The Art of Flight is an homage to '80s/'90s arcade action shmups with a fresh twist on the genre. Pilot multiple ships at the same time to take on oncoming waves of enemies in this fast-paced space shooter. Wishlist on Steam today!

revolver ocelot • 1 year ago

It's good, but I have yet to see it in practice or implemented.

MrNeRF • 1 year ago

I'm crafting an email newsletter that turns my daily updates into a captivating weekly digest, complete with exclusive content. Although it's not live yet, you can sign up now! If you're curious, visit my website and join the subscriber list today!

Marc Habermann • 1 year ago

We will soon release source code and data :) Concerning skeleton and mesh: our representation comes with both.

MrNeRF • 1 year ago

Author's thread:

темпдельтавелю • 1 year ago

Looks like this is impossible to control. How difficult would it be to convert to mesh + pose skeleton?

Rajendra Singh • 1 year ago

Wow, this is phenomenal

Bobby Rajesh Malhotra ⁴⁴⁴ • 1 year ago

A real eye opener

Related videos

Want to create an avatar from a single image? FlexAvatar is a transformer model that creates a full 360°, high-quality, expressive 3D head avatar from just a single portrait image in minutes.

Real-time Demo: FlexAvatar's lightweight architecture allows both animation and rendering in real time, enabling interactive user experiences. To create a new 3D head avatar, only one image is required, e.g., from a webcam. The final avatar is ready after 2 minutes.

Architecture: Under the hood, FlexAvatar adopts a transformer-based encoder-decoder design. The encoder maps the input image onto a latent avatar space, while the decoder produces 3D Gaussian attribute maps by incorporating the animation signal via cross-attention. The model learns all facial animations directly from the data without relying on pre-built 3D face models. This equips the avatars with realistic facial expressions. The internal avatar latent space can be conveniently used to integrate additional observations of a person via fitting. This enables use cases where more than one image of a person is available, e.g., from a phone scan of the person.

We train jointly on 2D monocular videos and multi-view data. However, in monocular videos, the animation signal leaks the target viewpoint, causing the model to produce incomplete 3D heads. We call this phenomenon entanglement of driving signal and target viewpoint. To prevent entanglement, we introduce bias sinks. These are learnable tokens that indicate whether a training sample stems from a monocular or a multi-view dataset. During training, the model learns to produce incomplete 3D heads only when the monocular token is present. During inference, FlexAvatar then always uses the multi-view token, for which the model has learned to produce complete 3D heads. This simple design makes it possible to combine the generalizability of monocular data with the quality of multi-view data.

FlexAvatar summary:
- Input: Single image, phone scan, or monocular video
- Output: Full 360° head avatar
- Expressive animations
- Real-time rendering and animation
- Generalization to any portrait
- Create a new avatar in 2 minutes
- Use bias sinks to combine 2D and 3D data

Great work by Tobias Kirschstein and Simon Giebenhain!
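The bias-sink idea described in the post can be illustrated with a minimal Python sketch. This is not FlexAvatar's code: the names `BiasSinks`, `select`, and `build_decoder_input` are hypothetical, the tokens are plain lists of floats rather than trained parameters, and prepending the token to the animation sequence is an assumption about how the token might reach the decoder. The only behavior taken from the post is the selection rule: during training the token matches the sample's dataset, and at inference the multi-view token is always used.

```python
import random

# Hypothetical dataset labels; not taken from any released code.
MONOCULAR = "monocular"
MULTI_VIEW = "multi_view"


class BiasSinks:
    """Two tokens, one per data source (stand-ins for learnable parameters)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.tokens = {
            MONOCULAR: [rng.gauss(0.0, 0.02) for _ in range(dim)],
            MULTI_VIEW: [rng.gauss(0.0, 0.02) for _ in range(dim)],
        }

    def select(self, source, training):
        # Training: use the token that matches the sample's dataset.
        # Inference: always use the multi-view token.
        key = source if training else MULTI_VIEW
        return self.tokens[key]


def build_decoder_input(animation_tokens, sinks, source, training):
    """Prepend the selected token to the animation tokens (assumed wiring)."""
    return [sinks.select(source, training)] + list(animation_tokens)


sinks = BiasSinks(dim=4)
animation = [[0.0] * 4 for _ in range(3)]
train_seq = build_decoder_input(animation, sinks, MONOCULAR, training=True)
infer_seq = build_decoder_input(animation, sinks, MONOCULAR, training=False)
```

In this sketch, `train_seq` starts with the monocular token while `infer_seq` starts with the multi-view token, even though both calls describe a monocular sample.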

Matthias Niessner

95,311 views • 6 months ago

Nvidia announces GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

paper page:

Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations. In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions, addressing the limitations (e.g., flexibility and efficiency) imposed by mesh or NeRF-based representations. However, a naive application of Gaussian splatting cannot generate high-quality animatable avatars and suffers from learning instability; it also cannot capture fine avatar geometries and often leads to degenerate body parts.

To tackle these problems, we first propose a primitive-based 3D Gaussian representation where Gaussians are defined inside pose-driven primitives to facilitate animation. Second, to stabilize and amortize the learning of millions of Gaussians, we propose to use neural implicit fields to predict the Gaussian attributes (e.g., colors). Finally, to capture fine avatar geometries and extract detailed meshes, we propose a novel SDF-based implicit mesh learning approach for 3D Gaussians that regularizes the underlying geometries and extracts highly detailed textured meshes.

Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts. GAvatar significantly surpasses existing methods in terms of both appearance and geometry quality, and achieves extremely fast rendering (100 fps) at 1K resolution.

AK

140,960 views • 2 years ago