
MrNeRF
@janusch_patas • 16,684 subscribers
Founder and CEO of https://t.co/5MjtfpwEU3 | Your guide to radiance fields | Host of the podcast @ViewDependent | FTP: 279 | discord: https://t.co/lrl64WGvlD

I'm excited to share the Geo Register Plugin for LichtFeld Studio from the LichtFeld community! This plugin helps bring Gaussian splat scenes into real-world geographic space. It registers a scene to WGS-84 and ECEF coordinates, so you can click any point on the model and get its latitude, longitude and altitude. It supports multiple georeferencing sources, including EXIF GPS data, image position CSVs, RealityScan camera parameters and saved similarity transforms. Once the scene is registered, you can export geo-referenced splat models as LAS, LAZ or 3D Tiles datasets for use in GIS and 3D mapping workflows. Built for anyone working with drone data, photogrammetry, Gaussian splatting, GIS, ArcGIS or CesiumJS. Link in the comment below!
MrNeRF • 50,680 views • 22 days ago
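
For readers wondering what "registering a scene to WGS-84 and ECEF" involves numerically, here is a minimal sketch using the standard WGS-84 ellipsoid formulas. It is not the plugin's code: the function names, the `apply_similarity` helper and the fixed iteration count are assumptions made for illustration.

```python
import math

# WGS-84 ellipsoid constants
WGS84_A = 6378137.0                    # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z


def ecef_to_geodetic(x, y, z, iterations=10):
    """Convert ECEF (m) to latitude/longitude (degrees) and ellipsoidal height (m).

    Uses a simple fixed-point iteration on the latitude; intended for points
    on or near the Earth's surface.
    """
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - WGS84_E2))  # exact for height 0
    for _ in range(iterations):
        n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
        lat = math.atan2(z + WGS84_E2 * n * math.sin(lat), p)
    # Height formula that stays well-behaved at the poles
    alt = (p * math.cos(lat) + z * math.sin(lat)
           - WGS84_A * math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2))
    return math.degrees(lat), math.degrees(lon), alt


def apply_similarity(scale, rotation, translation, point):
    """Hypothetical helper: map a scene-space point to ECEF as scale * R * point + t.

    `rotation` is a 3x3 row-major nested list, `translation` and `point` are 3-tuples.
    """
    return tuple(
        scale * sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
```

Under these assumptions, clicking a point would amount to `ecef_to_geodetic(*apply_similarity(s, R, t, point))`, where `s`, `R` and `t` come from whichever georeferencing source was used to register the scene.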

RT-Splatting: Joint Reflection-Transmission Modeling with Gaussian Splatting Contributions: • We introduce a unified surface-volume Gaussian scene representation for jointly modeling sharp specular reflections and clear transmission in real-world scenes containing thin semi-transparent surfaces. • We propose Specular-Aware Gradient Gating to suppress misleading gradients from complex specular regions, substantially reducing floaters in the transmission branch. • Extensive experiments demonstrate that RT-Splatting significantly outperforms prior methods while maintaining real-time rendering and enabling flexible scene editing.
MrNeRF • 27,322 views • 16 days ago

Europe Builds. Others Profit. 3D Gaussian Splatting (3DGS) is the perfect case study. It reflects both Europe’s brilliance and its chronic inability to turn that brilliance into business. Almost everything that made 3DGS possible was born in Europe. From the early breakthroughs in point-based rasterization in Switzerland to the cumulative research from Austria, Greece, and Germany executed in France, Europe built the foundation. No other continent can match that level of scientific collaboration and intellectual strength. The LichtFeld Studio bounty later confirmed it: the biggest performance leaps came straight out of European labs. The science was here. The innovation was here. The talent was here. But the business was not. When 3DGS exploded, my inbox filled with messages from US-based companies, not from Europe. In the United States, Luma AI and Polycam turned the paper into products within weeks. They did not wait for funding programs or EU consortia. They simply built. Then came China, which not only caught up in research but quickly outpaced everyone in commercialization. XGRID, DJI, and many others built thriving businesses around what Europe invented. Today, most 3DGS papers come from Chinese institutions rather than European ones. Meanwhile, the usual giants such as Meta, NVIDIA, Google, Netflix, and Tesla continue to iterate, integrate, and push forward. A thriving ecosystem of startups like World Labs leverages this technology to create new products and markets. The innovation cycle in the United States and China is fast, relentless, and market-driven. Europe, in contrast, remains bureaucratic and slow. We fund excellence and celebrate publications, but we rarely ship, even though some small startups are trying to change the status quo. Our researchers create the breakthroughs; others create the successful products. 
Until Europe finds a way to bridge the gap between laboratories and markets, it will remain the world’s research and development department: brilliant, underpaid, and underleveraged. Research is Europe’s comfort zone. Execution must become its strength. Video: One of my dynamic 3D Gaussian implementations based on the paper "Representing Long Volumetric Video with Temporal Gaussian Hierarchy."
MrNeRF • 159,100 views • 7 months ago

MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction TL;DR: The first RGB-only multi-agent 3D Gaussian Splatting SLAM for collaborative photorealistic scene reconstruction. Contributions: (1) We propose the first monocular RGB-only multi-agent 3D Gaussian Splatting SLAM system. It integrates Gaussian front-ends, compact submap summaries, inter-agent verification, Sim(3) submap pose graph, and occupancy-aware fusion into a unified framework, achieving accurate tracking and photorealistic reconstruction without depth sensors. (2) We propose a Pose-Graph Bundle Adjustment (PGBA)-consistent Sim(3) loop closure mechanism for multi-agent systems, which jointly resolves intra- and inter-agent scale drift through a submap-level Sim(3) pose graph coupling geometric and photometric residuals. Robustness is ensured by a spatial-extent gate that rejects degenerate loops and an adaptive edge invalidation scheme consistent with evolving PGBA corrections. (3) We propose an occupancy-aware fusion framework for coherent multi-agent Gaussian maps. It combines occupancy-grid deduplication, decoupled coordinator, and joint pose-Gaussian photometric refinement to eliminate duplicated Gaussians, residual misalignment, and photometric seams across agents. (4) We introduce ReplicaMultiagent Plus dataset. While existing multi-agent datasets are typically limited to 2-3 agents with short trajectories, our dataset scales to 4 agents with long-horizon trajectories. In addition, we provide ground-truth geometry and semantic annotations, supporting the evaluation of monocular, RGB-D, and semantic multi-agent SLAM for collaborative dense reconstruction.
MrNeRF • 19,072 views • 23 days ago

Geometric Context Transformer for Streaming 3D Reconstruction Contributions: • We introduce LingBot-Map, a streaming 3D foundation model built around Geometric Context Attention (GCA), which maintains three complementary context types – anchor, pose-reference window, and trajectory memory – for efficient and consistent long-sequence streaming inference. • We propose an efficient training recipe based on progressive training and context parallelism with a relative loss formulation for stable long-sequence optimization. • We demonstrate that LingBot-Map achieves state-of-the-art performance on multiple benchmarks (Oxford Spires, Tanks and Temples, ETH3D, and 7-Scenes), significantly outperforming existing streaming approaches in reconstruction quality and inference speed.
MrNeRF • 24,456 views • 1 month ago

[SIGGRAPH '26] Anchored Temporal Gaussian Splatting for Long Volumetric Video Representation TL;DR: We present ATGS, a novel framework for volumetric video reconstruction that effectively handles long sequences and complex motions. By utilizing time-conditioned anchors and a temporal windowing strategy, ATGS enhances temporal coherence and scalability. Abstract (excerpt): Key insight is that explicitly tracking long term complex motion with individual Gaussian primitives is inherently unstable. Instead, we organize Gaussians around time conditioned anchors that localize their spatial and temporal support, thereby reducing long range motion complexity. We further introduce a temporal windowing strategy to activate only anchors relevant to the queried time, which improves scalability and temporal coherence. In addition, to ensure spatial and temporal stability, we design a compact set of multi level anchor features that encode global features, local spatial features, and local temporal features, jointly constraining Gaussian generation. Extensive experiments demonstrate that ATGS consistently outperforms prior methods on long sequence volumetric videos with complex motions.
MrNeRF • 26,905 views • 1 month ago
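
To make the temporal windowing idea from the ATGS summary concrete, here is a toy sketch. It only illustrates the general notion of activating anchors whose temporal support covers the queried time. The `Anchor` fields and the simple interval test are assumptions, not the paper's actual formulation.

```python
from dataclasses import dataclass


@dataclass
class Anchor:
    anchor_id: int
    t_center: float  # hypothetical: centre of the anchor's temporal support
    t_radius: float  # hypothetical: half-width of that support


def active_anchors(anchors, t_query):
    """Return only the anchors whose temporal support contains t_query."""
    return [a for a in anchors if abs(t_query - a.t_center) <= a.t_radius]


anchors = [Anchor(0, 0.0, 0.6), Anchor(1, 1.0, 0.6), Anchor(2, 2.0, 0.6)]
active_ids = [a.anchor_id for a in active_anchors(anchors, 1.5)]
```

In this toy setup the per-frame work scales with the number of anchors near the queried time rather than with the length of the whole sequence, which is the scalability argument the summary makes.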

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery TL;DR: Skyfall-GS converts satellite images to explorable 3D urban scenes using diffusion models, with real-time rendering performance. Contributions: • We introduce Skyfall-GS, the first method to synthesize immersive, real-time, free-flight navigable 3D urban scenes solely from multi-view satellite imagery using generative refinement. • An open-domain refinement approach leverages pre-trained text-to-image diffusion models without domain-specific training. • A curriculum-learning-based iterative refinement strategy progressively enhances reconstruction quality from higher to lower viewpoints, significantly improving visual fidelity in occluded areas.
MrNeRF • 66,058 views • 7 months ago

Instant Skinned Gaussian Avatars for Web, Mobile and VR Applications Short summary: In our system, we animate a background 3D mesh and have the Gaussian splats follow the mesh’s vertices. During preprocessing, splats are assigned to mesh vertices, and their relative transformations are stored. Once this data is saved, you can instantly use it in your applications without further preprocessing. At runtime, we animate the background 3D mesh, update the Gaussian splats in parallel, and resort all Gaussian splats every frame based on the viewer’s perspective.
MrNeRF • 66,016 views • 7 months ago
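
The pipeline described in the avatar post (bind splats to mesh vertices once, then update and re-sort them every frame) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it stores a translation-only offset per splat instead of a full relative transformation, binds each splat to its nearest vertex, and all function names are made up for the example.

```python
def bind_splats(splat_positions, rest_vertices):
    """Preprocessing: attach each splat to its nearest rest-pose vertex.

    Returns a list of (vertex_index, offset) pairs, where offset is the
    splat position relative to that vertex (translation only, an assumption).
    """
    bindings = []
    for s in splat_positions:
        idx = min(
            range(len(rest_vertices)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(s, rest_vertices[i])),
        )
        offset = tuple(a - b for a, b in zip(s, rest_vertices[idx]))
        bindings.append((idx, offset))
    return bindings


def update_splats(bindings, posed_vertices):
    """Runtime: move every splat along with the vertex it is bound to."""
    return [
        tuple(v + o for v, o in zip(posed_vertices[idx], offset))
        for idx, offset in bindings
    ]


def sort_back_to_front(positions, camera_pos, view_dir):
    """Runtime: splat indices ordered from farthest to nearest along view_dir."""
    def depth(p):
        return sum((pc - cc) * d for pc, cc, d in zip(p, camera_pos, view_dir))
    return sorted(range(len(positions)), key=lambda i: depth(positions[i]), reverse=True)
```

Only `bind_splats` involves a search, and it runs once during preprocessing. The per-frame work in `update_splats` is independent for each splat, so it parallelises naturally, which fits the post's description of updating the splats in parallel before re-sorting them.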

Here is some new footage from this paper, offering a glimpse into the future of dynamic 3D Gaussian Splatting models combined with static reconstructed scenes. Imagine this: when the lighting matches, the result becomes practically indistinguishable from reality. Just pick a scene, add characters, and record it from any angle. Apply diffusion models to instantly change the look. I firmly believe this is the future of VFX.
MrNeRF • 57,806 views • 6 months ago

Huge update: LichtFeld Studio v0.5.0 🚀 What’s new: • Embedded Python runtime + plugin system makes LFS fully hackable and extensible (isolated uv environments, hot reload) • Integrated plugin marketplace (6 plugins incl. Sharp4D, densification++) • MCP protocol integration (full parity with the user interaction layer) • Mesh rendering + OpenMesh (Python) + Mesh2Splat • ImprovedGS+ (arxiv:2603.08661) • RmlUI-based GUI (HTML and CSS style workflows) • Undo/Redo with plugin integration, Sequencer, PPISP. Huge thanks to our corporate sponsor Core11 GmbH and to all contributors 🙏 Enjoy! If you find it useful, consider supporting the project by donating to keep it evolving. Next up: better training quality and smoother editing workflows.
MrNeRF • 19,277 views • 2 months ago

[SIGGRAPH ASIA '25] Detail-Enhanced Gaussian Splatting for Large-Scale Volumetric Capture Contributions: - A two-stage approach to performance capture, combining a scene-scale capture rig and a single-actor facial capture rig. - A novel high-quality scene-scale volumetric performance capture rig, incorporating both static and dynamic cameras to track the performance of multiple actors. - A reconstruction pipeline for dynamic performance capture, featuring stable calibration of moving cameras and 4DGS with improved dynamic range and color fidelity. - A detail enhancement Diffusion Model, which supports 4K, RGB, and Alpha, with improved temporal stability.
MrNeRF • 42,317 views • 7 months ago

[SIGGRAPH Asia '24 (TOG)] Representing Long Volumetric Video with Temporal Gaussian Hierarchy Contributions: • We introduce a novel, efficient, and expressive Temporal Gaussian Hierarchy representation for long volumetric video. To our knowledge, our method is the first approach capable of handling minutes of volumetric video data. • We propose a Compact Appearance Model and a new rasterization implementation to facilitate real-time, high-quality dynamic view synthesis while maintaining a compact size. • We propose a system to efficiently model long volumetric videos for the first time and demonstrate state-of-the-art dynamic view synthesis quality on the Neural3DV [Li et al. 2022], ENeRF-Outdoor [Lin et al. 2022], and MobileStage [Xu et al. 2024b] datasets, while also achieving the best rendering speed with reduced training cost and memory usage.
MrNeRF • 79,249 views • 1 year ago

Forget about #Sora. DUSt3R is the real deal. I took two pictures of our kitchen that barely overlap. It took << 2 sec on an RTX 4090 to reconstruct it in insane quality. Can we get a point cloud out of it for Gaussian Splatting #3DGS training, plus the camera poses?
MrNeRF • 103,255 views • 2 years ago

ViPE: Video Pose Engine for 3D Geometric Perception Contributions: • A robust and efficient framework, ViPE, for estimating camera parameters and dense depth from diverse, in-the-wild videos. • A system design that integrates the strengths of classical SLAM (efficiency, scalability) and learned models (robustness), with key improvements in efficiency, dynamic object handling, and depth quality over prior work. • A large-scale dataset of annotated videos, created using ViPE, to facilitate future research in 3D computer vision.
MrNeRF • 42,521 views • 9 months ago

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos Contributions: • An incremental joint optimization approach for simultaneous camera pose and 3DGS reconstruction, reducing local minima and ensuring global consistency. • A robust pose estimation module leveraging learned 3D priors for accurate camera pose estimation. • An adaptive Octree Anchor Formation strategy that significantly reduces memory usage while preserving reconstruction quality.
MrNeRF • 35,969 views • 9 months ago

Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes Contributions: • We propose STORM, the first feed-forward, self-supervised method for fast and accurate reconstruction of dynamic 3D scenes from sparse, multi-timestep, posed camera images. • Our bottom-up framework aggregates and transforms per-frame 3D Gaussian Splats into a cohesive scene representation, enabling self-supervised motion estimation. Furthermore, we introduce motion tokens that capture common motion primitives and regularize motion predictions, facilitating dynamic motion group segmentation without explicit motion or correspondence supervision. • We present several enhancements for in-the-wild scenarios, including sky modeling, camera exposure inconsistency handling, large novel-view extrapolation, and fine-grained human motions reconstruction, making STORM well-suited for real-world applications.
MrNeRF • 53,292 views • 1 year ago

PPISP will arrive in the upcoming nightly build of LichtFeld Studio! It is now fully supported including the controller. LichtFeld is the only software that also provides GUI integration. Give it a shot in the next build. I tried to make it as memory efficient as possible and it only adds a small VRAM penalty. However, the controller comes with some downsides: • It is an extra file, to not pollute the PLY. • The appearance adjustments cannot be baked into the splat, i.e. you will need LFS for now. • Hard to imagine that any popular web viewer will support it anytime soon. However, very soonish there will be a sequencer which you can use to create videos. As LichtFeld Studio continues to grow and expand, that's not going to be a real downside anyway :)
MrNeRF • 17,075 views • 4 months ago