1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering Contributions: • We delve into the temporal redundancy of 4D Gaussian Splatting and explain the main reason for the storage pressure and suboptimal rendering speed. • We introduce 4DGS-1K, a compact and memory-efficient framework to address these issues. It consists...

MrNeRF

15,861 subscribers

12,200 views • 1 year ago • via X (Twitter)

Science & Technology, Education

8 comments

MrNeRF • 1 year ago

Paper: Project:

Infinite-Realities • 1 year ago

If only these solutions and codebases could handle real world datasets and not just the guy frying a steak!

MrNeRF • 1 year ago

I'm crafting an email newsletter that turns my daily updates into a captivating weekly digest, complete with exclusive content. Although it's not live yet, you can sign up now! If you're curious, visit my website and join the subscriber list today!

Data • 1 year ago

My God!

LLMLens • 1 year ago

Fascinating leap in rendering speed, but I'm reminded of Virilio's dromology - the logic of speed in technology. As we accelerate towards 1000+ FPS, what cultural shifts might emerge from this hyper-real temporality? How does it reshape our perception of digital materiality?

potat • 1 year ago

> We delve 🤣

MrNeRF • 1 year ago

Not that I would know better 😂

Related videos

OccluGaussian: Occlusion-Aware Gaussian Splatting for Large Scene Reconstruction and Rendering Contributions: • We propose an occlusion-aware scene division strategy that considers the scene layout and camera co-visibilities. The resulting regions barely contain occlusions, and the corresponding training cameras have a higher average contribution, leading to improved reconstruction results. • We present a region-based rendering technique that accelerates 3D Gaussian splatting in large scenes. It eliminates much of the time-consuming processing of invisible 3D Gaussians, boosting rendering speeds without noticeable quality degradation. • We conduct extensive experiments on several large-scene datasets and demonstrate that OccluGaussian achieves superior rendering quality and faster rendering speed compared to previous state-of-the-art methods.

MrNeRF

10,718 views • 1 year ago

[SIGGRAPH '26] Anchored Temporal Gaussian Splatting for Long Volumetric Video Representation TL;DR: We present ATGS, a novel framework for volumetric video reconstruction that effectively handles long sequences and complex motions. By utilizing time-conditioned anchors and a temporal windowing strategy, ATGS enhances temporal coherence and scalability. Abstract (excerpt): Key insight is that explicitly tracking long term complex motion with individual Gaussian primitives is inherently unstable. Instead, we organize Gaussians around time conditioned anchors that localize their spatial and temporal support, thereby reducing long range motion complexity. We further introduce a temporal windowing strategy to activate only anchors relevant to the queried time, which improves scalability and temporal coherence. In addition, to ensure spatial and temporal stability, we design a compact set of multi level anchor features that encode global features, local spatial features, and local temporal features, jointly constraining Gaussian generation. Extensive experiments demonstrate that ATGS consistently outperforms prior methods on long sequence volumetric videos with complex motions.

MrNeRF

26,905 views • 2 months ago

[SIGGRAPH Asia '24 (TOG)] Representing Long Volumetric Video with Temporal Gaussian Hierarchy Contributions: • We introduce a novel, efficient, and expressive Temporal Gaussian Hierarchy representation for long volumetric video. To our knowledge, our method is the first approach capable of handling minutes of volumetric video data. • We propose a Compact Appearance Model and a new rasterization implementation to facilitate real-time, high-quality dynamic view synthesis while maintaining a compact size. • We propose a system to efficiently model long volumetric videos for the first time and demonstrate state-of-the-art dynamic view synthesis quality on the Neural3DV [Li et al. 2022], ENeRF-Outdoor [Lin et al. 2022], and MobileStage [Xu et al. 2024b] datasets, while also achieving the best rendering speed with reduced training cost and memory usage.

MrNeRF

79,379 views • 1 year ago

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models Contributions: • We introduce 4D LangSplat for open-vocabulary 4D spatial-temporal queries. To the best of our knowledge, we are the first to construct 4D language fields with object textual captions generated by MLLMs. • To model smooth transitions across states for objects in 4D scenes, we propose a status deformable network to capture continuous temporal changes. • Experimental results show that our method attains state-of-the-art performance for both time-agnostic and time-sensitive open-vocabulary queries.

MrNeRF

10,953 views • 1 year ago

RT-Splatting: Joint Reflection-Transmission Modeling with Gaussian Splatting Contributions: • We introduce a unified surface-volume Gaussian scene representation for jointly modeling sharp specular reflections and clear transmission in real-world scenes containing thin semi-transparent surfaces. • We propose Specular-Aware Gradient Gating to suppress misleading gradients from complex specular regions, substantially reducing floaters in the transmission branch. • Extensive experiments demonstrate that RT-Splatting significantly outperforms prior methods while maintaining real-time rendering and enabling flexible scene editing.

MrNeRF

27,917 views • 1 month ago

[SIGGRAPH ASIA '25] Detail-Enhanced Gaussian Splatting for Large-Scale Volumetric Capture Contributions: - A two-stage approach to performance capture, combining a scene-scale capture rig and a single-actor facial capture rig. - A novel high-quality scene-scale volumetric performance capture rig, incorporating both static and dynamic cameras to track the performance of multiple actors. - A reconstruction pipeline for dynamic performance capture, featuring stable calibration of moving cameras and 4DGS with improved dynamic range and color fidelity. - A detail enhancement Diffusion Model, which supports 4K, RGB, and Alpha, with improved temporal stability.

MrNeRF

42,382 views • 7 months ago

Whilst we figure out a solution for temporal filtering on 3D Gaussian Splatting input frames, we can use a temporal filtering plugin on the screen captures or rendered images as a visualization tool to check what the future results might look like. "Processing and rendering a digital human with 3D Gaussian Splatting with a post process to simulate temporal filtering." Left video: no temporal filtering. Right video: with temporal filtering. R&D for archviz projects. #GaussianSplatting #ir #inria #sibr #aftereffects #temporalfiltering

Infinite-Realities

58,368 views • 2 years ago

GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views TL;DR: Are we witnessing the first steps towards 3DGS live streaming? Contributions: • We introduce a generalizable 3D Gaussian Splatting methodology that employs pixel-wise Gaussian parameter maps defined on 2D source image planes to formulate 3D Gaussians in a feed-forward manner. • We propose a fully differentiable framework composed of an iterative depth estimation module and a Gaussian parameter regression module. The intermediate depth prediction bridges the two components and allows them to benefit from joint training. • We introduce a regularization term and an epipolar attention mechanism to preserve geometry consistency between the two source views when using only rendering loss. Our method generalizes well to unseen characters even in complicated scenes. • We develop a real-time FVV system that achieves high-resolution rendering of characters in the scene without any geometry supervision.

MrNeRF

25,699 views • 1 year ago

Happy to introduce Habitat-GS, a non-intrusive extension of Habitat-Sim that brings dynamic Gaussian Splatting for photorealistic rendering and comes with hundreds of high-quality 3DGS scene assets, aiming to empower navigation research. Code:

Sida Peng

11,997 views • 1 month ago

HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting Contributions: First, we propose Homogeneous Gaussian Splatting (HoGS), a novel method adopting homogeneous coordinates to represent positions and scales of 3DGS for realistic and real-time rendering of both near and far objects. Second, despite the ultimate simplicity of HoGS, our method achieves state-of-the-art NVS results compared to other implicit and explicit representations.

MrNeRF

22,978 views • 1 year ago

LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS Contributions: • LangSplatV2 achieves real-time performance with 476.2 FPS for high-dimensional feature splatting and 384.6 FPS for 3D open-vocabulary text querying. • Delivers a 42× speedup and 47× boost in performance compared to LangSplat. • Improves query accuracy while drastically reducing inference time. • Replaces the heavyweight decoder in LangSplat with a sparse coefficient field, removing the main performance bottleneck. • Introduces a CUDA-optimized sparse coefficient splatting method, enabling fast and high-quality rendering of high-dimensional features. • Enables scalable 3D language interaction in complex scenes, opening up real-time applications previously not possible with LangSplat.

MrNeRF

14,261 views • 11 months ago

4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos Abstract: We propose 4DGT, a 4D Gaussian-based Transformer model for dynamic scene reconstruction, trained entirely on real-world monocular posed videos. Using 4D Gaussian as an inductive bias, 4DGT unifies static and dynamic components, enabling the modeling of complex, time-varying environments with varying object lifespans. We introduced a novel density control strategy in training, which allows our 4DGT to handle longer space-time input while maintaining efficient rendering at runtime. Our model processes 64 consecutive posed frames in a rolling-window fashion, predicting consistent 4D Gaussians in the scene. Unlike optimization-based methods, 4DGT performs purely feed-forward inference, reducing reconstruction time from hours to seconds and scaling effectively to long video sequences. Trained only on large-scale monocular posed video datasets, 4DGT can significantly outperform prior Gaussian-based networks in real-world videos and achieve on-par accuracy with optimization-based methods on cross-domain videos.

MrNeRF

34,782 views • 1 year ago

[SIGGRAPH 2025] Photoreal Scene Reconstruction from an Egocentric Device Contributions: 1. We address the importance of employing visual-inertial bundle adjustment (VIBA) that accounts for the rolling-shutter behavior of the RGB camera. This provides a continuous camera trajectory to model pixel movement in neural reconstruction. Our experiments demonstrate that using VIBA consistently improves the novel view quality in Gaussian Splatting by +1 dB in PSNR. 2. We introduce a rasterization-based image formulation pipeline that addresses common artifacts in physical image formation, including rolling shutter, lens shading, exposure, and gain compensation. Our approach is distinct in that we represent image poses as posed pixel arrays sampled from a continuous trajectory, rather than assigning a single camera pose per image, and preserve the merit of Gaussian rasterization. Unlike existing methods that require ray-tracing Gaussians, e.g., [Moenne-Loccoz et al. 2024], our formulation is applicable to general-purpose rasterization-based Gaussian splatting. When applied to 3D Gaussian Splatting (3DGS) [Kerbl et al. 2023], our approach can further enhance reconstruction quality by +1 dB. We outperform existing baselines and demonstrate a substantial quality improvement in handling complex scenes observed by egocentric devices. 3. To reduce the effect of blur from rapid head motion in darker indoor scenes, we propose a strategy of deliberately underexposing input videos during capture, inspired by HDR+ [Hasinoff et al. 2016]. We demonstrate that we can reconstruct high-quality, noise-free scene radiance from noisy, dim input videos, and further render sharp, blur-free videos at a higher dynamic range.

MrNeRF

15,244 views • 1 year ago

Google presents RadSplat Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS Recent advances in view synthesis and real-time rendering have achieved photorealistic quality at impressive rendering speeds. While Radiance Field-based

AK

139,521 views • 2 years ago

a one minute explanation of 4d gaussian splatting

dylan

186,592 views • 2 years ago

Nvidia announces GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning paper page: Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations. In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions, addressing the limitations (e.g., flexibility and efficiency) imposed by mesh or NeRF-based representations. However, a naive application of Gaussian splatting cannot generate high-quality animatable avatars and suffers from learning instability; it also cannot capture fine avatar geometries and often leads to degenerate body parts. To tackle these problems, we first propose a primitive-based 3D Gaussian representation where Gaussians are defined inside pose-driven primitives to facilitate animation. Second, to stabilize and amortize the learning of millions of Gaussians, we propose to use neural implicit fields to predict the Gaussian attributes (e.g., colors). Finally, to capture fine avatar geometries and extract detailed meshes, we propose a novel SDF-based implicit mesh learning approach for 3D Gaussians that regularizes the underlying geometries and extracts highly detailed textured meshes. Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts. GAvatar significantly surpasses existing methods in terms of both appearance and geometry quality, and achieves extremely fast rendering (100 fps) at 1K resolution.

AK

140,960 views • 2 years ago

GS^3: Efficient Relighting with Triple Gaussian Splatting Abstract: We present a spatial and angular Gaussian based representation and a triple splatting process, for real-time, high-quality novel lighting-and-view synthesis from multi-view point-lit input images. To describe complex appearance, we employ a Lambertian plus a mixture of angular Gaussians as an effective reflectance function for each spatial Gaussian. To generate self-shadow, we splat all spatial Gaussians towards the light source to obtain shadow values, which are further refined by a small multi-layer perceptron. To compensate for other effects like global illumination, another network is trained to compute and add a per-spatial-Gaussian RGB tuple. The effectiveness of our representation is demonstrated on 30 samples with a wide variation in geometry (from solid to fluffy) and appearance (from translucent to anisotropic), as well as using different forms of input data, including rendered images of synthetic/reconstructed objects, photographs captured with a handheld camera and a flash, or from a professional lightstage. We achieve a training time of 40-70 minutes and a rendering speed of 90 fps on a single commodity GPU. Our results compare favorably with state-of-the-art techniques in terms of quality/performance.

MrNeRF

17,759 views • 1 year ago

FAU Erlangen-Nürnberg presents TRIPS Trilinear Point Splatting for Real-Time Radiance Field Rendering paper page: Point-based radiance field rendering has demonstrated impressive results for novel view synthesis, offering a compelling blend of rendering quality and computational efficiency. However, even the latest approaches in this domain are not without their shortcomings. 3D Gaussian Splatting [Kerbl and Kopanas et al. 2023] struggles when tasked with rendering highly detailed scenes, due to blurring and cloudy artifacts. On the other hand, ADOP [Rückert et al. 2022] can accommodate crisper images, but the neural reconstruction network decreases performance, it grapples with temporal instability and it is unable to effectively address large gaps in the point cloud. In this paper, we present TRIPS (Trilinear Point Splatting), an approach that combines ideas from both Gaussian Splatting and ADOP. The fundamental concept behind our novel technique involves rasterizing points into a screen-space image pyramid, with the selection of the pyramid layer determined by the projected point size. This approach allows rendering arbitrarily large points using a single trilinear write. A lightweight neural network is then used to reconstruct a hole-free image including detail beyond splat resolution. Importantly, our render pipeline is entirely differentiable, allowing for automatic optimization of both point sizes and positions. Our evaluation demonstrates that TRIPS surpasses existing state-of-the-art methods in terms of rendering quality while maintaining a real-time frame rate of 60 frames per second on readily available hardware. This performance extends to challenging scenarios, such as scenes featuring intricate geometry, expansive landscapes, and auto-exposed footage.

AK

45,459 views • 2 years ago
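The TRIPS description above says that each point is written into a screen-space image pyramid, with the layer chosen by the projected point size, using a single trilinear write. The following is a minimal sketch of how such a write could distribute one point's contribution, not the authors' implementation. It assumes that layer `l` has texels `2**l` pixels wide, that the continuous layer index is `log2(size)`, and that texel centers sit at half-integer coordinates; the function name `trilinear_write_weights` and its parameters are hypothetical.

```python
import math

def trilinear_write_weights(x, y, size, num_levels):
    """Return (level, px, py, weight) tuples for the texels one point touches.

    Assumption: a point with projected size 2**l pixels covers roughly one
    texel of pyramid layer l, so the continuous layer index is log2(size).
    """
    # Clamp the continuous layer index to the valid pyramid range.
    level = min(max(math.log2(max(size, 1.0)), 0.0), num_levels - 1)
    l0 = int(math.floor(level))
    l1 = min(l0 + 1, num_levels - 1)
    t = level - l0  # interpolation factor between the two adjacent layers

    writes = []
    for lvl, w_level in ((l0, 1.0 - t), (l1, t)):
        if w_level == 0.0:
            continue  # the point lies exactly on one layer
        scale = 2.0 ** lvl
        # Position in this layer's texel grid (texel centers at +0.5).
        u = x / scale - 0.5
        v = y / scale - 0.5
        u0, v0 = math.floor(u), math.floor(v)
        fu, fv = u - u0, v - v0
        # Bilinear weights over the 2x2 texel neighborhood in this layer.
        for dx, wx in ((0, 1.0 - fu), (1, fu)):
            for dy, wy in ((0, 1.0 - fv), (1, fv)):
                writes.append((lvl, u0 + dx, v0 + dy, w_level * wx * wy))
    return writes

# A point of projected size 3 px lands between layers 1 and 2.
for entry in trilinear_write_weights(10.3, 4.7, 3.0, num_levels=4):
    print(entry)
```

The weights always sum to one, so a point contributes the same total amount regardless of its size. The sketch does no bounds checking, so texel indices near the image border can fall outside the layer and would need clamping or discarding in a real rasterizer.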

[NeurIPS '24] DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation. Abstract (excerpt): We introduce DreamMesh4D, a novel framework that combines mesh representation with a sparse-controlled deformation technique to generate a high-quality 4D object from a monocular video. To overcome the limitations of classical texture representation, we bind Gaussian splats to the surface of the triangular mesh for differentiable optimization of both the texture and the mesh vertices. In particular, DreamMesh4D begins with a coarse mesh provided by a single-image-based 3D generation method. Sparse points are then uniformly sampled across the surface of the mesh and used to build a deformation graph that drives the motion of the 3D object, both for computational efficiency and to provide additional constraints. At each step, the transformations of the sparse control points are predicted by a deformation network, and the mesh vertices as well as the bound surface Gaussians are deformed via a geometric skinning algorithm. The skinning algorithm is a hybrid approach combining LBS (linear blend skinning) and DQS (dual-quaternion skinning), mitigating drawbacks associated with both approaches. The static surface Gaussians and mesh vertices as well as the dynamic deformation network are learned via a reference-view photometric loss, a score distillation loss, and other regularization losses in a two-stage manner. Extensive experiments demonstrate that our method outperforms prior video-to-4D generation methods in terms of rendering quality and spatial-temporal consistency.


MrNeRF

12,323 views • 1 year ago
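The DreamMesh4D excerpt above describes deforming mesh vertices from the transformations of sparse control points via skinning. The sketch below illustrates only the linear blend skinning (LBS) half of that idea in plain Python; it does not implement dual-quaternion skinning or the paper's hybrid scheme. The function name `lbs_deform` and the way weights, rotations, and translations are passed in are assumptions made for illustration.

```python
def lbs_deform(vertex, weights, rotations, translations):
    """Linear blend skinning of one vertex.

    vertex:       [x, y, z]
    weights:      one non-negative weight per control point
    rotations:    one 3x3 rotation matrix (nested lists) per control point
    translations: one [tx, ty, tz] per control point

    Each control point transforms the vertex rigidly; the results are
    averaged with the normalized weights.
    """
    total = sum(weights)
    out = [0.0, 0.0, 0.0]
    for w, R, t in zip(weights, rotations, translations):
        for i in range(3):
            transformed_i = sum(R[i][j] * vertex[j] for j in range(3)) + t[i]
            out[i] += (w / total) * transformed_i
    return out

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Two control points that only translate, each with equal influence.
moved = lbs_deform(
    [1.0, 2.0, 3.0],
    weights=[0.5, 0.5],
    rotations=[IDENTITY, IDENTITY],
    translations=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
print(moved)
```

Because LBS averages the transformed positions directly, blending strongly differing rotations can shrink the geometry, which is one of the drawbacks that dual-quaternion skinning is commonly used to reduce.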

Enter the SPLATRIX! We digitized Henry Pearce and converted him over to 4DGS to be played back in real time. 3D Gaussian Splats in motion: 4DGS. Our goal was to improve the quality of our AeonX & 4DGS pipeline, with automated background removal to retain fine hair details as well as to improve overall surface consistency on a full torso capture. We're currently using a post-process to temporally filter the 2D frames. We still need to solve 4DGS temporal stability. We're hoping to test with Jonathon Luiten's dynamic Gaussian code soon... 🙏 for that script! We captured this performance under global illumination. This is an easy task compared to RGB+W lighting, which is going to be much harder! This is our next challenge. We're posting these snippet videos to show our progress with 3D/4DGS. We plan to write a full blog post on our 3D/4DGS journey soon. 3D Gaussian Splatting #GaussianSplatting #inria #radiancefields #ir #aeonx #rgbw #idatronic #ximea #sibr #hdri #60fps More info here - Higher Quality Videos:


Infinite-Realities

90,625 views • 2 years ago