Most generative models predict pixels. Predicting a 3D scene instead has many benefits: the scene won’t change if you look away and come back, and it obeys the basic physical rules of 3D geometry. The simplest way to visualize the 3D scene is a depth map, where each pixel...
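The description above is cut off, so the following is not World Labs' method. It is only a generic sketch of what a depth map encodes: each pixel stores a distance from the camera, which a pinhole camera model can turn back into a 3D point. The function name `unproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions made for illustration.

```python
def unproject_depth(depth, fx, fy, cx, cy):
    """Convert a depth map into 3D points in camera coordinates.

    depth is a list of rows; each entry is the distance along the
    camera z-axis for that pixel. fx, fy, cx, cy are hypothetical
    pinhole intrinsics (focal lengths and principal point, in pixels).
    """
    points = []
    for v, row in enumerate(depth):        # v: pixel row
        for u, z in enumerate(row):        # u: pixel column
            x = (u - cx) * z / fx          # back-project horizontally
            y = (v - cy) * z / fy          # back-project vertically
            points.append((x, y, z))
    return points


# Toy 2x2 depth map: the top row is nearer (2 units) than the bottom row (4 units).
depth = [[2.0, 2.0],
         [4.0, 4.0]]
points = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Because every point comes from the same geometry, moving the camera only changes how these points are projected, not where they are.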

World Labs

47,349 subscribers

16,605 views • 1 year ago • via X (Twitter)

Science & Technology

9 Comments

World Labs • 1 year ago

We’ve been busy building an AI system to generate 3D worlds from a single image. Check out some early results on our site, where you can interact with our scenes directly in the browser! 1/n

World Labs • 1 year ago

World Labs aims to address the challenges many creators face with existing genAI models: a lack of control and consistency. Given an input image, our system estimates 3D geometry, fills in unseen parts of the scene, invents new content so you can turn around, and generalizes to a wide variety of scene types and artistic styles. 2/n

World Labs • 1 year ago

Our output 3D scenes can be rendered in real-time in the browser with full camera control. This means you can explore them with a freely moving camera like in a videogame, or even simulate 3D camera effects like shallow depth of field or dolly zoom. 3/n
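As a rough illustration of the dolly zoom mentioned above (a standard camera relation, not World Labs' renderer code): a subject plane at distance d seen with field of view θ spans a width of 2·d·tan(θ/2), so keeping that width fixed while the camera moves means adjusting the field of view. The function name `dolly_zoom_fov` is hypothetical.

```python
import math


def dolly_zoom_fov(fov_deg, distance, new_distance):
    """Field of view (degrees) that keeps the subject the same size on screen.

    The visible width at the subject is 2 * d * tan(fov / 2). Holding that
    width constant while the camera moves from `distance` to `new_distance`
    gives the new field of view.
    """
    width = 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)
    return math.degrees(2.0 * math.atan(width / (2.0 * new_distance)))


# Pulling the camera back from 1 to 2 units narrows a 90 degree view
# to roughly 53 degrees while the subject stays the same size.
new_fov = dolly_zoom_fov(90.0, 1.0, 2.0)
```

This only works when the scene has real 3D geometry: the background has to reproject correctly as the camera position and field of view change together.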

World Labs • 1 year ago

Generating consistent 3D geometry allows us to interact with the scene in 3D-aware ways, like changing the scene’s lighting and appearance, modifying the geometry, or inserting other objects into the scene. 5/n

World Labs • 1 year ago

We also had some fun peeking into the worlds behind a few creative masterpieces, like the neighborhood surrounding the diner in Edward Hopper’s iconic painting Nighthawks. 6/n

World Labs • 1 year ago

3D world generation naturally composes with other AI tools. This allows creators to work with tools they already know to enable new experiences. We've given a few creators an early sneak peek at our technology to begin experimenting with the possibilities enabled by a 3D-native generative AI workflow. 7/n

World Labs • 1 year ago

shows how our models fill a gap in his creative workflow, making it easy to stage characters within scenes and direct precise camera movements. 8/n

World Labs • 1 year ago

This is just a glimpse into the future of 3D native generative AI – we’re working hard to put this tech into users’ hands as soon as possible! Stay tuned for future releases by signing up at or get in touch directly at [email protected]. n/n

manee_az • 1 year ago

Scene Graphs ftw! 😄

Related Videos

Bring your stories to life with a 3D camera. Start with a single frame and turn it into a 3D scene you can move through, shot by shot. Control the camera. Set the pace. Shape the story.

Moonvalley

18,713 views • 11 months ago

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior paper page: We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes the geometry consistency but compromises the texture fidelity. We further propose Bootstrapped Score Distillation to specifically boost the texture. We train a personalized diffusion model, DreamBooth, on the augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through an alternating optimization of the diffusion prior and 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state-of-the-art in 3D content generation.

AK

161,530 views • 2 years ago

Yes, AI is here and 3D artists have to adapt. Here's one way I'm integrating AI with my 3D work, as a set extension. The BG is generated by Kling 3.0 and brought into my 3D scene so I can fully control the timing of the lights in the FG, which is 3D. Seedance 2.0 next!!

David Ariew

29,651 views • 5 months ago

Christian Rupprecht explains their interpretability research in 3D computer vision, testing if (and where in the model) multi-view transformers like VGGT, DepthAnything 3, and DUSt3R use point/patch correspondences to make sense of 3D scene geometry.

Chris Offner

74,403 views • 3 months ago

🚀 Announcing Echo — our new frontier model for 3D world generation. Echo turns a simple text prompt or image into a fully explorable, 3D-consistent world. Instead of disconnected views, the result is a single, coherent spatial representation you can move through freely. This is part of a bigger shift in AI: from generating pixels and tokens to generating spaces. Echo predicts a geometry-grounded 3D scene at metric scale, meaning every novel view, depth map, and interaction comes from the same underlying world — not independent hallucinations. Once generated, the world is interactive in real time. You control the camera, explore from any angle, and render instantly — even on low-end hardware, directly in the browser. High-quality 3D world exploration is no longer gated by expensive equipment. Under the hood, Echo infers a physically grounded 3D representation and converts it into a renderable format. For our web demo, we use 3D Gaussian Splatting (3DGS) for fast, GPU-friendly rendering — but the representation itself is flexible and can be easily adapted. Why this matters: consistent 3D worlds unlock real workflows — digital twins, 3D design, game environments, robotics simulation, and more. From a single photo or a line of text, Echo builds worlds that are reliable, editable, and spatially faithful. Echo also enables scene editing and restyling. Change materials, remove or add objects, explore design variations — all while preserving global 3D consistency. Editing no longer breaks the world. This is only the beginning. Echo is the foundation for future world models with dynamics, physical reasoning, and richer interaction — environments that don’t just look right, but behave right. Explore the generated worlds on our website and sign up for the closed beta. The era of spatial intelligence starts here. 🌍 #Echo #WorldModels #SpatialAI #3DFoundationModels Check it out:

SpAItial AI

176,105 views • 7 months ago

3D scene reconstructions by NVIDIA. ArtiFixer:
- repairs artifacts and extends sparse views via Wan 2.1
- high-fidelity inpainting in occluded regions
- generates hundreds of consistent frames in a single pass
- 3D Gaussian Splatting for navigable scene reconstruction
It makes the 3D environment look photorealistic and fully navigable for VR/AR, basically turning a broken 3D model into a polished, professional scene.

Wildminder

15,711 views • 1 month ago

The challenge in creating this scene was moving from a 3D environment to a 2D painted environment. We used the character to bait-and-switch out the backgrounds. #2d #3d #animation

THE LINE

696,584 views • 3 years ago

Volumetric, animated 3D noise shader block in Unity. Where each frame is 2D space, and the 3rd dimension is time. So that, measurement of change within a slice is 'space', and measurement of change across slices is 'time'. As a 3D block, you see the noise animate into the 3rd dimension, but when looking at only a single slice, it appears to evolve in place. Likewise, for 4D... If: this were 3D noise, then: the 4th dimension would be whatever processing/frames/states of 3D data.

Mirza Beig

126,907 views • 7 months ago
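The time-as-third-dimension idea from the Unity noise post above can be sketched outside a shader. This is a minimal CPU illustration, not the author's Unity code: a simple hash-based value noise (an assumed stand-in for whatever noise the shader uses) is sampled on a 2D grid, and the animation time is passed as the z coordinate. All function names are hypothetical.

```python
import math


def lattice_value(ix, iy, iz):
    """Deterministic pseudo-random value in [0, 1) for an integer lattice point."""
    h = (ix * 374761393 + iy * 668265263 + iz * 2147483647) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return (h ^ (h >> 16)) / 2**32


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def value_noise_3d(x, y, z):
    """Trilinearly interpolated value noise at a continuous 3D position."""
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    tx, ty, tz = x - x0, y - y0, z - z0

    def corner(dx, dy, dz):
        return lattice_value(x0 + dx, y0 + dy, z0 + dz)

    # Interpolate along x on the four edges, then along y, then along z.
    x00 = lerp(corner(0, 0, 0), corner(1, 0, 0), tx)
    x10 = lerp(corner(0, 1, 0), corner(1, 1, 0), tx)
    x01 = lerp(corner(0, 0, 1), corner(1, 0, 1), tx)
    x11 = lerp(corner(0, 1, 1), corner(1, 1, 1), tx)
    return lerp(lerp(x00, x10, ty), lerp(x01, x11, ty), tz)


def frame(time, size=4, scale=0.5):
    """One 2D slice of the 3D noise block: x and y are space, z is time."""
    return [[value_noise_3d(u * scale, v * scale, time) for u in range(size)]
            for v in range(size)]


# Two nearby slices of the same block: the pattern evolves in place.
frame_a = frame(0.0)
frame_b = frame(0.5)
```

Viewed one slice at a time, the values drift smoothly because neighbouring slices share the same lattice values; viewed as a block, the same data is just static 3D noise.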

One of the hardest things to achieve with AI is precise character motion. The new model by Kinetix, Kamo-1, is amazing at giving you far more control over your generations. It’s also the first 3D-conditioned model, so it understands the scene in 3D and gives you almost unlimited camera motion. Let me show you how to use it 👇

Everett World

19,199 views • 7 months ago

Project Light Touch from Adobe enables you to modify the lighting in any image. You can move the lighting anywhere within a scene in 3D space, adjusting its color, position, depth, and brightness.

CHRIS FIRST

107,394 views • 8 months ago

InstantSplat++ is now open source. It is a lightweight library that connects foundation models (VGGT, MASt3R, MAP-Anything, etc.) with the Gaussian splatting family. Given uncalibrated images, it optimizes a 3D scene in a few seconds. Try the demo and code here:

Zhiwen(Aaron) Fan

31,864 views • 4 months ago

Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields paper page: Editing a local region or a specific object in a 3D scene represented by a NeRF is challenging, mainly due to the implicit nature of the scene representation. Consistently blending a new realistic object into the scene adds an additional level of difficulty. We present Blended-NeRF, a robust and flexible framework for editing a specific region of interest in an existing NeRF scene, based on text prompts or image patches, along with a 3D ROI box. Our method leverages a pretrained language-image model to steer the synthesis towards a user-provided text prompt or image patch, along with a 3D MLP model initialized on an existing NeRF scene to generate the object and blend it into a specified region in the original scene. We allow local editing by localizing a 3D ROI box in the input scene, and seamlessly blend the content synthesized inside the ROI with the existing scene using a novel volumetric blending technique. To obtain natural looking and view-consistent results, we leverage existing and new geometric priors and 3D augmentations for improving the visual fidelity of the final result. We test our framework both qualitatively and quantitatively on a variety of real 3D scenes and text prompts, demonstrating realistic multi-view consistent results with much flexibility and diversity compared to the baselines. Finally, we show the applicability of our framework for several 3D editing applications, including adding new objects to a scene, removing/replacing/altering existing objects, and texture conversion.

AK

62,768 views • 3 years ago

I'm genuinely blown away by this. The leap from text descriptions straight to 3D models? It's next-level. Think about the possibility: a stream of prompts turns into a treasure trove of 3D pieces. Gather them, and you've got a full scene ready to come to life. The thought alone was exciting, but seeing it in action? Mind-bending. Take user 'NeonGlitch86' for instance. They've crafted an entire miniature world using nothing but Luma AI's #Genie text-to-3D tool combined with Blender's magic. "Totally geeking out with Luma AI. Today's project was a scene built from scratch with generated 3D models. Slapped some Mixamo rigs onto my characters and placed them into Blender. It's like a playground for bringing scene sketches to life." - NeonGlitch86

Linus ✦ Ekenstam

696,059 views • 2 years ago

Omma AI is wild. You type one sentence and it can create:
• a 3D scene
• a full website
• a working app
At the same time. One agent writes code. One builds the 3D model. One designs the interface. That is a very different direction from most AI tools.

Julian Goldie SEO

11,982 views • 3 months ago

Every single final clip you see in this video was rendered with Ray3 Modify in Dream Machine by Luma, each clip originating from a handcrafted 3D scene. This is a glimpse of the 3D/VFX workflows of the future, keeping the parts we love and making the whole process more efficient and limitless.

Luma | Dream Lab

32,065 views • 6 months ago

We now have aircraft tracking on our 3D incident map through our website. It is amazing to see the helicopters dip to a water source and then go to the fire perimeter, over and over again, in near real-time. Our teams are working around the clock to bring the best and most accurate information to everyone. Southern California 3D scene:

CAL FIRE LNU

128,014 views • 1 year ago

Welcome to the 3D Layer of the Internet. We are thrilled to share our latest features: the Prompt-to-3D engine and Scene Studio. Your text-based prompts and photos can now be transformed from simple concepts into fully rendered, interactive 3D models. It parses complex topology, handles multi-view reconstruction, and even lets you restyle your model instantly with single-click filters like low-poly, voxel, and Voronoi shells. Within the all-new Scene Studio, you can construct immersive 3D environments that reflect your exact style. Position your assets in a full sandbox space, adjust advanced lighting and environment colors, and add the finishing touches with custom camera angles, fog settings, and real-time script parameters. This is the future of the open web. This is

three.ws

37,254 views • 1 month ago

Is 3D scene generation much closer to being solved all of a sudden? It has been a few days since the release of OpenAI Sora. We ran our COLMAP-Free 3D Gaussian Splatting on the released videos. Our method does not need to pre-process cameras, and it seems we can get 3D directly from the videos. Check out our results here. 🧵👇 (1/n)

Xiaolong Wang

157,395 views • 2 years ago

Gaussian Shell Maps are a new neural scene representation that connects fields and 3D Gaussians. This representation unlocks the full potential of 3D Gaussian splatting for generative AI applications, such as 3D avatar generation. 1/2

Gordon Wetzstein

52,480 views • 2 years ago

Introducing SAM 3D, the newest addition to the SAM collection, bringing common sense 3D understanding of everyday images. SAM 3D includes two models: 🛋️ SAM 3D Objects for object and scene reconstruction 🧑‍🤝‍🧑 SAM 3D Body for human pose and shape estimation Both models achieve state-of-the-art performance transforming static 2D images into vivid, accurate reconstructions. 🔗 Learn more:

AI at Meta

859,247 views • 8 months ago