DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery Abstract: Drones have become essential tools for reconstructing wild scenes due to their outstanding maneuverability. Recent advances in radiance field methods have achieved remarkable rendering quality, providing a new avenue for 3D reconstruction from drone imagery. However, dynamic distractors in wild environments challenge the static scene assumption in radiance fields, while limited view constraints hinder the accurate capture of underlying scene geometry. To address these challenges, we introduce DroneSplat, a novel framework designed for robust 3D reconstruction from in-the-wild drone imagery. Our method adaptively adjusts masking thresholds by integrating local-global segmentation heuristics with statistical approaches, enabling precise identification and elimination of dynamic distractors in static scenes. We enhance 3D Gaussian Splatting with multi-view stereo predictions and a voxel-guided optimization strategy, supporting high-quality rendering under limited view constraints. For comprehensive evaluation, we provide a drone-captured 3D reconstruction dataset encompassing both dynamic and static scenes. Extensive experiments demonstrate that DroneSplat outperforms both 3DGS and NeRF baselines in handling in-the-wild drone imagery.

MrNeRF

16,779 subscribers

21,346 views • 1 year ago • via X (Twitter)

Related videos

GSTAR: Gaussian Surface Tracking and Reconstruction Contributions: • A new framework for tracking and reconstructing dynamic scenes, combining 3D Gaussians and meshes to effectively manage changes in topology. • A method for Gaussian unbinding and surface re-meshing, allowing for the generation of new surfaces as topologies evolve. • A method for handling large or fast deformations of surfaces between frames using scene flow warping. Abstract (excerpt): However, tracking dynamic surfaces with 3D Gaussians remains challenging due to complex topology changes, such as surfaces appearing, disappearing, or splitting. To address these challenges, we propose GSTAR, a novel method that achieves photo-realistic rendering, accurate surface reconstruction, and reliable 3D tracking for general dynamic scenes with changing topology. Given multi-view captures as input, GSTAR binds Gaussians to mesh faces to represent dynamic objects. For surfaces with consistent topology, GSTAR maintains the mesh topology and tracks the meshes using Gaussians.

MrNeRF

22,698 views • 1 year ago

3D Gaussian Splatting for Real-Time Radiance Field Rendering paper page: Radiance Field methods have recently revolutionized novel-view synthesis of scenes captured with multiple photos or videos. However, achieving high visual quality still requires neural networks that are costly to train and render, while recent faster methods inevitably trade off speed for quality. For unbounded and complete scenes (rather than isolated objects) and 1080p resolution rendering, no current method can achieve real-time display rates. We introduce three key elements that allow us to achieve state-of-the-art visual quality while maintaining competitive training times and importantly allow high-quality real-time (>= 30 fps) novel-view synthesis at 1080p resolution. First, starting from sparse points produced during camera calibration, we represent the scene with 3D Gaussians that preserve desirable properties of continuous volumetric radiance fields for scene optimization while avoiding unnecessary computation in empty space; Second, we perform interleaved optimization/density control of the 3D Gaussians, notably optimizing anisotropic covariance to achieve an accurate representation of the scene; Third, we develop a fast visibility-aware rendering algorithm that supports anisotropic splatting and both accelerates training and allows realtime rendering. We demonstrate state-of-the-art visual quality and real-time rendering on several established datasets.

AK

633,428 views • 2 years ago
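
To illustrate the "anisotropic covariance" mentioned in the 3D Gaussian Splatting abstract above, here is a minimal sketch that builds a 3D covariance matrix from a per-axis scale and a rotation quaternion, using the factorization Σ = R S Sᵀ Rᵀ. That form keeps the matrix symmetric and positive semi-definite however the parameters are updated. The function names and the (w, x, y, z) quaternion ordering are assumptions made for this example. They are not taken from the post or from any particular codebase.

```python
import math


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix.

    The quaternion is normalized first, so any non-zero input is accepted.
    """
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scale, quat):
    """Return Sigma = R S S^T R^T for S = diag(scale) and R from quat."""
    R = quat_to_rotmat(quat)
    # M = R @ S: scaling column j of R by scale[j].
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T, which is symmetric by construction.
    return [
        [sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


if __name__ == "__main__":
    # A Gaussian stretched along its local y axis, rotated 90 degrees about z.
    half = math.sqrt(0.5)
    for row in covariance((1.0, 2.0, 3.0), (half, 0.0, 0.0, half)):
        print([round(v, 6) for v in row])
```

Optimizing a scale vector and a quaternion instead of the six free entries of Σ means gradient steps cannot produce an invalid covariance matrix.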

Wonderland: Navigating 3D Scenes from a Single Image Contributions: • First, we introduce a representation for controllable 3D generation by leveraging the generative priors from camera-guided video diffusion models. Unlike image models, video diffusion models are trained on extensive video datasets. This enables them to capture comprehensive spatial relationships within scenes across multiple views and embed a form of "3D awareness" in their latent space, which allows us to maintain 3D consistency in novel view synthesis. • Second, to achieve controllable novel view generation, we empower video models with precise control over specified camera motions. We introduce a novel dual-branch conditioning mechanism that effectively incorporates desired diverse camera trajectories into the video diffusion model. This enables expansion of a single image into a multi-view consistent capture of a 3D scene with precise pose control. • Third, to achieve efficient 3D reconstruction, we directly transform video latents into 3DGS. We propose a novel latent-based large reconstruction model (LaLRM) that lifts video latents to 3D in a feed-forward manner. With this design, during inference, our model directly predicts 3DGS from a single input image, effectively aligning the generation and reconstruction tasks—and bridging image space and 3D space—through the video latent space. Compared with reconstructing scenes from images, the video latent space offers a 256× spatial-temporal reduction while retaining essential and consistent 3D structural details. Such a high degree of compression is crucial, as it allows the LaLRM to handle a wider range of 3D scenes within the reconstruction framework, with the same memory constraints.

MrNeRF

52,801 views • 1 year ago

Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction Contributions: • A method for efficiently reconstructing dynamic surfaces from multi-view videos using Gaussian surfels. • A unified and gradient-aware densification strategy for optimizing dynamic 3D Gaussians with fine details. • A temporal consistency approach that ensures stable and coherent surface reconstructions across frames by enforcing consistency on curvature maps. • Extensive experiments that demonstrate our method’s advantages including fast training, high-fidelity novel view synthesis, and accurate surface geometry.

MrNeRF

31,821 views • 1 year ago

Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction Note: Check below for full video. Abstract (cited): "In this paper, we present a self-calibrating framework that jointly optimizes camera parameters, lens distortion, and 3D Gaussian representations, enabling accurate and efficient scene reconstruction. Our technique is particularly effective for high-quality scene reconstruction from large field-of-view (FOV) imagery taken with wide-angle lenses, allowing the scene to be modeled from a smaller number of images. We introduce a novel method for modeling complex lens distortions using a hybrid network that combines invertible residual networks with explicit grids. This design effectively regularizes the optimization process, achieving greater accuracy than conventional camera models. Additionally, we propose a cubemap-based resampling strategy to support large FOV images without sacrificing resolution or introducing distortion artifacts. Our method is compatible with the fast rasterization of Gaussian Splatting, adaptable to a wide variety of camera lens distortions, and demonstrates state-of-the-art performance on both synthetic and real-world datasets."

MrNeRF

17,206 views • 1 year ago

We will be in ACM SIGGRAPH 2023 with "3D Gaussian Splatting for Real-Time Radiance Field Rendering", have you ever seen radiance fields with 100+ FPS and MipNeRF360 quality? Check out our website here:

George Kopanas

135,782 views • 3 years ago

Human Hair Reconstruction with Strand-Aligned 3D Gaussians Contributions (cited): – We propose a new 3D line lifting scheme that uses a modified 3DGS reconstruction technique to lift 2D orientation maps into a 3D field while also providing refinement of the camera parameters; – We introduce a dual representation of hair strand polylines and 3D Gaussians to achieve differentiable rasterization of hair strands and leverage photometric constraints for strand-based hair reconstruction; – Based on these components, we propose a coarse-to-fine optimization method for prior-guided hair reconstruction that leverages both latent and explicit representations of the hairstyle.

MrNeRF

106,497 views • 1 year ago

Segment Any 3D Gaussians paper page: Interactive 3D segmentation in radiance fields is an appealing task given its importance in 3D scene understanding and manipulation. However, existing methods face challenges in either achieving fine-grained, multi-granularity segmentation or contending with substantial computational overhead, inhibiting real-time interaction. In this paper, we introduce Segment Any 3D GAussians (SAGA), a novel 3D interactive segmentation approach that seamlessly blends a 2D segmentation foundation model with 3D Gaussian Splatting (3DGS), a recent breakthrough of radiance fields. SAGA efficiently embeds multi-granularity 2D segmentation results generated by the segmentation foundation model into 3D Gaussian point features through well-designed contrastive training. Evaluation on existing benchmarks demonstrates that SAGA can achieve competitive performance with state-of-the-art methods. Moreover, SAGA achieves multi-granularity segmentation and accommodates various prompts, including points, scribbles, and 2D masks. Notably, SAGA can finish the 3D segmentation within milliseconds, achieving nearly 1000x acceleration compared to previous SOTA.

AK

69,542 views • 2 years ago

MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting Contributions: • MoE-GS: the first dynamic Gaussian splatting framework employing a Mixture-of-Experts architecture, enabling robust and adaptive reconstruction across diverse dynamic scenes. • A novel Volume-aware Pixel Router integrates expert outputs through differentiable weight splatting, achieving spatially and temporally coherent adaptive blending. • Efficiency of MoE-GS is improved through single-pass multi-expert rendering and gate-aware Gaussian pruning. A separate knowledge distillation strategy trains individual experts with pseudo-labels from the MoE model, enhancing quality without modifying the architecture.

MrNeRF

10,346 views • 8 months ago

Triangle Splatting for Real-Time Radiance Field Rendering Contributions: (i) We propose Triangle Splatting, a novel approach that directly optimizes unstructured triangles, bridging traditional computer graphics and radiance fields. (ii) We introduce a differentiable window function for soft triangle boundaries, enabling effective gradient flow. (iii) We demonstrate qualitatively and quantitatively that Triangle Splatting outperforms concurrent methods in terms of visual quality and rendering speed, and achieves superior perceptual quality compared to the state-of-the-art Zip-NeRF on indoor scenes. (iv) The optimized triangles are directly compatible with standard mesh-based renderers, enabling seamless integration into traditional graphics pipelines.

MrNeRF

51,407 views • 1 year ago

Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians paper page: Creating high-fidelity 3D head avatars has always been a research hotspot, but there remains a great challenge under lightweight sparse view setups. In this paper, we propose Gaussian Head Avatar represented by controllable 3D Gaussians for high-fidelity head avatar modeling. We optimize the neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, thereby our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a well-designed geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra for the stability and convergence of the training procedure. Experiments show our approach outperforms other state-of-the-art sparse-view methods, achieving ultra high-fidelity rendering quality at 2K resolution even under exaggerated expressions.

AK

65,834 views • 2 years ago

I always thought camera pose estimation is necessary for 3D reconstruction, until Zequn and Stephen proved me wrong! Introducing PreF3R, purely feed-forward 3D Gaussian Splatting without any intermediate pose estimation and COLMAP initialization. Video in, 3D Gaussians and novel view rendering out. Key technique: spatial memory network from Spann3R and Gaussian head supervised by pointmap loss plus photometric loss. See more at:

Heng Yang

20,160 views • 1 year ago

[SIGGRAPH ASIA '25] Detail-Enhanced Gaussian Splatting for Large-Scale Volumetric Capture Contributions: - A two-stage approach to performance capture, combining a scene-scale capture rig and a single-actor facial capture rig. - A novel high-quality scene-scale volumetric performance capture rig, incorporating both static and dynamic cameras to track the performance of multiple actors. - A reconstruction pipeline for dynamic performance capture, featuring stable calibration of moving cameras and 4DGS with improved dynamic range and color fidelity. - A detail enhancement Diffusion Model, which supports 4K, RGB, and Alpha, with improved temporal stability.

MrNeRF

42,382 views • 8 months ago

Apple just trained a 3D Gaussian head reconstruction model on 10,000+ subjects. Feed-forward. No test-time optimization. New identity in, reconstructed Gaussian head out. The UV-parameterized Gaussian representation decouples the number of Gaussians from the number and resolution of input images, making it practical to train with many high resolution views. And the heads are not just static either: text-conditioned identity generation, plus blendshape-driven latent animation across identities. We've been building in the 3D Gaussian Splatting space for a while. The gap between "research demo" and "works on real people at scale" is closing fast.

KIRI Engine - 3D Scanner App

12,013 views • 1 month ago

📢We introduce “RefFusion”, a novel inpainting method for scenes reconstructed using 3D Gaussian Splatting. 🔗 TLDR: we personalize an image diffusion model to a given reference image and distill its knowledge to 3D through score distillation sampling.

Ashkan Mirzaei

34,700 views • 2 years ago

F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting Contributions: • We pioneer 3D-aware generation using generalizable feed-forward Gaussian Splatting representation, achieving significant efficiency and favorable rendering quality on monocular datasets. • We significantly advance the capability of pixel-aligned Gaussian Splatting representations by designing a self-supervised cycle training strategy specifically tailored for monocular datasets. • We further mitigate the artifacts of 3D-aware representations caused by large viewpoint shifts by introducing geometry-aware video priors.

MrNeRF

14,229 views • 1 year ago

We are excited to introduce Stable Fast 3D, Stability AI’s latest breakthrough in 3D asset generation technology. This innovative model transforms a single input image into a detailed 3D asset in just 0.5 seconds, setting a new standard for speed and quality in the field of 3D reconstruction! Alongside this release, we’ve also published a technical report that highlights how we achieve fast inference speeds with reduced baked illumination and material parameters. 👾You can learn more and access the report here:

Stability AI

438,350 views • 1 year ago

✨ Any static 3D assets ➡️ 4D dynamic worlds. Introducing CHORD, a universal framework for generating scene-level 4D dynamic motion from any static 3D inputs. It generalizes surprisingly well across a wide range of objects 🤯 and can even be used to learn robotics manipulation policy 🤖! Project page: Dive deeper in a 🧵: 1/n

Chen Geng

43,033 views • 5 months ago

WeatherEdit: Controllable Weather Editing with 4D Gaussian Field Contributions: 1. Based on our analysis of weather editing characteristics, we introduce WeatherEdit, a comprehensive and efficient framework for realistic and controllable weather generation. Compared with existing methods that focus on either background editing or static weather effects, a progressive 2D-to-4D transformation process in WeatherEdit enhances adaptability across a wider range of scenarios. 2. We introduce an all-in-one adapter to enable a diffusion model for multi-weather (snowy, rainy, and fog) synthesis, along with a Temporal-View attention to ensure consistent editing across multi-frame and multi-view. 3. We design a 4D Gaussian field for weather particle modeling, enabling plausible simulation of raindrops, snowflakes, and fog with controllable severity. 4. We demonstrate WeatherEdit’s effectiveness in generating realistic, consistent, and controllable weather effects in 3D driving scenes, showcasing its applicability to real-world scenarios.

MrNeRF

10,607 views • 1 year ago