WorldExplorer: Towards Generating Fully Navigable 3D Scenes

Contributions:
• We introduce the first method for generating 3D scenes from text that supports high-quality view synthesis while enabling exploration across a wide range of camera poses.
• We propose an iterative scene expansion strategy using video diffusion models, driven by...

23,814 views • 1 year ago • via X (Twitter)

9 comments

MrNeRF • 1 year ago

Paper: Project: YouTube:

VistaShares • 1 year ago

From semiconductors to data centers, AIS targets the critical components behind AI's exponential growth. Capture potential returns from this transformative technology sector.

MrNeRF • 1 year ago

I'm crafting an email newsletter that turns my daily updates into a captivating weekly digest, complete with exclusive content. Although it's not live yet, you can sign up now! If you're curious, visit my website and join the subscriber list today!

Tom Bielecki • 1 year ago

We’re getting so close to Black Mirror :) Have you seen any method that blends Gaussian Splatting with raytracing somehow?

MrNeRF • 1 year ago

Sure! Either or

Shoubhik • 1 year ago

Will the code be released soon?

MrNeRF • 1 year ago

🤷

SaraIverson • 1 year ago

Wow, this is next-level! @alexgraytrust’s insights on emerging tech trends really help put innovations like this into perspective. Excited to see where this 3D scene generation goes—could be a game-changer for so many industries!

Peter • 1 year ago

Frame rate is very low!

Related videos

🚨 SIGGRAPH Asia 2025 Paper Alert 🚨

➡️Paper Title: WorldExplorer: Towards Generating Fully Navigable 3D Scenes

🌟Few pointers from the paper
🎯Generating 3D worlds from text is a highly anticipated goal in computer vision. Existing works are limited by the degree of exploration they allow inside a scene, i.e., they produce stretched-out and noisy artifacts when moving beyond central or panoramic perspectives.
🎯To this end, the authors of this paper propose “WorldExplorer”, a novel method based on autoregressive video trajectory generation, which builds fully navigable 3D scenes with consistent visual quality across a wide range of viewpoints.
🎯They initialize their scenes by creating multi-view consistent images corresponding to a 360 degree panorama.
🎯Then, they expand it by leveraging video diffusion models in an iterative scene generation pipeline.
🎯Concretely, they generate multiple videos along short, pre-defined trajectories that explore the scene in depth, including motion around objects.
🎯Their novel scene memory conditions each video on the most relevant prior views, while a collision-detection mechanism prevents degenerate results, like moving into objects.
🎯Finally, they fuse all generated views into a unified 3D representation via 3D Gaussian Splatting optimization.
🎯Compared to prior approaches, WorldExplorer produces high-quality scenes that remain stable under large camera motion, enabling for the first time realistic and unrestricted exploration.
🎯They believe this marks a significant step toward generating immersive and truly explorable virtual 3D environments.

🏢Organization: TU München
🧙Paper Authors: Manuel-Andreas Schneider, Lukas Höllein, Matthias Niessner
📝 Read the Full Paper here:
🗂️ Project Page:
🧑‍💻 Code:
🎥 Be sure to watch the attached Technical Summary Video - Sound on 🔊🔊

Find this Valuable 💎?
♻️QT and teach your network something new
Follow me 👣, naveen manwani, for the latest updates on Tech and AI-related news, insightful research papers, and exciting announcements.
#SIGGRAPHAsia2025

naveen manwani

10,578 views • 8 months ago

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior paper page: We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes the geometry consistency but compromises the texture fidelity. We further propose Bootstrapped Score Distillation to specifically boost the texture. We train a personalized diffusion model, Dreambooth, on the augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through an alternating optimization of the diffusion prior and 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state-of-the-art in 3D content generation.

AK

161,400 views • 2 years ago

Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields paper page: Editing a local region or a specific object in a 3D scene represented by a NeRF is challenging, mainly due to the implicit nature of the scene representation. Consistently blending a new realistic object into the scene adds an additional level of difficulty. We present Blended-NeRF, a robust and flexible framework for editing a specific region of interest in an existing NeRF scene, based on text prompts or image patches, along with a 3D ROI box. Our method leverages a pretrained language-image model to steer the synthesis towards a user-provided text prompt or image patch, along with a 3D MLP model initialized on an existing NeRF scene to generate the object and blend it into a specified region in the original scene. We allow local editing by localizing a 3D ROI box in the input scene, and seamlessly blend the content synthesized inside the ROI with the existing scene using a novel volumetric blending technique. To obtain natural looking and view-consistent results, we leverage existing and new geometric priors and 3D augmentations for improving the visual fidelity of the final result. We test our framework both qualitatively and quantitatively on a variety of real 3D scenes and text prompts, demonstrating realistic multi-view consistent results with much flexibility and diversity compared to the baselines. Finally, we show the applicability of our framework for several 3D editing applications, including adding new objects to a scene, removing/replacing/altering existing objects, and texture conversion.

AK

62,768 views • 3 years ago