4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos

Abstract: We propose 4DGT, a 4D Gaussian-based Transformer model for dynamic scene reconstruction, trained entirely on real-world monocular posed videos. Using 4D Gaussian as an inductive bias, 4DGT unifies static and dynamic components, enabling the modeling of complex, time-varying...

34,782 views • 1 year ago • via X (Twitter)

11 comments

MrNeRF • 1 year ago

Paper: not yet
Project:

"4DGT takes a series of monocular frames with poses as input. During training, we subsample the temporal frames at different granularity and use all images for supervision. In stage one, we train 4DGT to predict pixel-aligned Gaussians at coarse resolution. In stage two, we prune a majority of non-activated Gaussians based on the histograms of per-patch activation channels and densify the Gaussian prediction by increasing the input token samples in both space and time. At inference time, we run the 4DGT network trained after stage two, which supports dense video frames input at high resolution."
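To make the stage-two pruning step in the quote a bit more concrete, here is a minimal sketch of one way a histogram over per-patch activation channels could be used to decide which Gaussians to keep. Everything specific is an assumption for illustration only: that each patch predicts a fixed number of Gaussians (one per channel), that "activated" means opacity above a threshold, and the names `activation_histogram`, `channels_to_keep`, `threshold` and `keep`. None of it is taken from the 4DGT code.

```python
from collections import Counter


def activation_histogram(opacities, threshold=0.05):
    """Count, for each channel index, how many patches activate it.

    `opacities` is a list of patches; each patch is a list with one
    opacity per predicted Gaussian (channel). A channel counts as
    activated in a patch when its opacity exceeds `threshold`
    (an assumed criterion, not the paper's).
    """
    hist = Counter()
    for patch in opacities:
        for channel, alpha in enumerate(patch):
            if alpha > threshold:
                hist[channel] += 1
    return hist


def channels_to_keep(opacities, keep, threshold=0.05):
    """Return the `keep` most frequently activated channel indices."""
    hist = activation_histogram(opacities, threshold)
    num_channels = len(opacities[0])
    # Rank by activation count (descending); break ties by channel index.
    ranked = sorted(range(num_channels), key=lambda c: (-hist[c], c))
    return sorted(ranked[:keep])


# Toy data: 3 patches, 4 Gaussians (channels) per patch.
patch_opacities = [
    [0.90, 0.01, 0.40, 0.00],
    [0.80, 0.02, 0.00, 0.00],
    [0.70, 0.00, 0.30, 0.01],
]
kept = channels_to_keep(patch_opacities, keep=2)
print(kept)
```

In this toy case channels 1 and 3 are never activated, so they would be the ones pruned before the prediction is densified with more input tokens.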

MrNeRF • 1 year ago

Paper:

Pablo Vela • 1 year ago

Wow this looks really sick

MrNeRF • 1 year ago

Yeah, and the clip is super long.

Micky Abir • 1 year ago

people don’t realize how huge this is

MrNeRF • 1 year ago

long long videos, yeah!

TessyVFXR • 1 year ago

The fact that I can't get my head off this for the past few days... For me, it is that much needed tool that unlocks a lot.

James | 🤖 • 1 year ago

Awesome. Looking forward to trying this out!

MrNeRF • 1 year ago

I'm crafting an email newsletter that turns my daily updates into a captivating weekly digest, complete with exclusive content. Although it's not live yet, you can sign up now! If you're curious, visit my website and join the subscriber list today!

Mars (parody) • 1 year ago

the future is beaming into reality gaaah this is so exciting

MrNeRF • 1 year ago

Pretty good for monocular footage. The videos are also very long!

Related videos

[SIGGRAPH 2025] Photoreal Scene Reconstruction from an Egocentric Device

Contributions:

1. We address the importance of employing visual-inertial bundle adjustment (VIBA) that accounts for the rolling-shutter behavior of the RGB camera. This provides a continuous camera trajectory to model pixel movement in neural reconstruction. Our experiments demonstrate that using VIBA consistently improves the novel view quality in Gaussian Splatting by +1 dB in PSNR.

2. We introduce a rasterization-based image formulation pipeline that addresses common artifacts in physical image formation, including rolling shutter, lens shading, exposure, and gain compensation. Our approach is distinct in that we represent image poses as posed pixel arrays sampled from a continuous trajectory, rather than assigning a single camera pose per image, and preserve the merit of Gaussian rasterization. Unlike existing methods that require ray-tracing Gaussians, e.g., [Moenne-Loccoz et al. 2024], our formulation is applicable to general-purpose rasterization-based Gaussian splatting. When applied to 3D Gaussian Splatting (3DGS) [Kerbl et al. 2023], our approach can further enhance reconstruction quality by +1 dB. We outperform existing baselines and demonstrate a substantial quality improvement in handling complex scenes observed by egocentric devices.

3. To reduce the effect of blur from rapid head motion in darker indoor scenes, we propose a strategy of deliberately underexposing input videos during capture, inspired by HDR+ [Hasinoff et al. 2016]. We demonstrate that we can reconstruct high-quality, noise-free scene radiance from noisy, dim input videos, and further render sharp, blur-free videos at a higher dynamic range.

MrNeRF

15,244 views • 1 year ago
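
For a sense of scale on the "+1 dB in PSNR" figures quoted in the contributions above: PSNR is defined as 10·log10(MAX² / MSE), so a 1 dB gain corresponds to the mean squared error shrinking by a factor of 10^0.1 ≈ 1.26, or roughly 20% lower MSE. The short sketch below just works through that arithmetic. The baseline MSE is a made-up number for illustration, not a result from the paper, and the helper names are hypothetical.

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE must shrink to gain `gain_db` dB of PSNR."""
    return 10.0 ** (gain_db / 10.0)


baseline_mse = 1e-3  # hypothetical value, pixel range [0, 1]
improved_mse = baseline_mse / mse_ratio_for_gain(1.0)

gain = psnr(improved_mse) - psnr(baseline_mse)
reduction = 1.0 - improved_mse / baseline_mse

print(f"PSNR gain: {gain:.2f} dB, MSE reduction: {reduction:.1%}")
```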