This AI paper just solved Google Earth's biggest problem. Satellites look down. Humans look across. That perspective gap is why 3D maps are limited to cities you can blanket with aerial flyovers. Skyfall-GS bridges the gap by synthesizing the views we never captured - rebuilding missing facades and street-level...

Bilawal Sidhu

104,214 subscribers

135,003 views • 8 months ago • via X (Twitter)

Science & Technology

Related videos

So these researchers figured out you can basically hallucinate 3D cities into existence using just satellite photos & a diffusion model. The problem's pretty straightforward: satellites only see rooftops. Building facades? Invisible. Street-level detail? Doesn't exist. But people want flyable 3D environments, which means you need all that occluded geometry. When I worked on google maps photogrammetry, we could only use satellite-based 3D for isolated stuff like the pyramids - anything city-scale required airplane flyovers. Which is fine until you hit aerial-denied regions where you literally can't fly. Huge chunks of the world just unavailable. Their trick is honestly kind of beautiful. They train gaussian splats on satellite views, but as it descends toward ground level, the renders turn to absolute garbage - artifacts everywhere. Instead of fighting this, they just treat those nightmare renders as the input to a diffusion model. Basically - "hey FLUX, fix this mess." Then here's where it gets clever: they generate multiple diffusion samples per view instead of committing to one. Because any single denoising path is probably wrong in 3D space, but if you generate a couple and let the GS optimization find consensus across them, you get actual geometric consistency. They do this in episodes, curriculum style - start high, gradually descend (hence the name Skyfall-GS!). With each iteration the ground-level views get less fucked. By the end you've got real-time flyable cities that look surprisingly real, and the geometry still matches the satellite input. No 3D training data. No street-level photos. Just satellites + diffusion doing what it does best - filling in the blanks. It's like neural scene completion but actually practical, and it unlocks basically the entire world.
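A toy sketch of the two ideas in the post above: a high-to-low camera schedule, and fitting one scene parameter to several refined samples at once. Everything here is an illustrative assumption (the function names, the linear schedule, the scalar "pixel", the squared-error loss); it is not the paper's code. It only shows why optimizing against several samples settles on their consensus rather than on any single one.

```python
def descending_elevations(start_deg, end_deg, episodes):
    """Camera elevation per episode, from high (satellite-like) to low (near ground).

    The linear spacing is an assumption made for illustration.
    """
    if episodes == 1:
        return [float(start_deg)]
    step = (start_deg - end_deg) / (episodes - 1)
    return [start_deg - i * step for i in range(episodes)]


def fit_to_samples(value, samples, lr=0.1, steps=200):
    """Gradient descent on the mean squared error against all samples at once.

    `value` stands in for a scene parameter and `samples` for several
    diffusion-refined targets of the same view (both hypothetical).
    """
    for _ in range(steps):
        grad = sum(2.0 * (value - s) for s in samples) / len(samples)
        value -= lr * grad
    return value


if __name__ == "__main__":
    print(descending_elevations(80, 10, 8))
    # Three disagreeing samples of the same pixel: the fit lands between them.
    print(fit_to_samples(0.0, [1.0, 2.0, 3.0]))
```

Under a squared-error loss the fitted value converges to the mean of the samples, which is the simplest form of "consensus"; a real pipeline would optimize millions of Gaussian parameters against full images instead.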

Bilawal Sidhu

241,899 views • 9 months ago

This is pretty cool 👓 Google Announces Geospatial Creator powered by ARCore & Google Maps and it allows you to visualize, design, and publish world-anchored content by using Unity ! Few features available: 📌 Photorealistic 3D Tiles: Visualize the 3D geometry of the world and deploy location-anchored content accurately with high-resolution, photorealistic 3D tiles. 📌 Geospatial anchors: Place and anchor 3D content at any given latitude, longitude, and altitude with sub-meter accuracy in areas covered by Google Street View. 📌 Terrain anchors: Place and anchor 3D content with only latitude and longitude coordinates, using data from Google Maps to determine ground level. 📌 Rooftop anchors: Use rooftops to place and anchor 3D content with respect to the building geometry and terrain in areas covered by Google Street View. 📢 To get started with Unity follow these steps: #AR #Google
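A small sketch of how the anchor types described above differ in what the caller supplies. The class names, the `resolve_terrain` helper, the `height_above_ground_m` field, and the ground-elevation lookup are hypothetical; they are not the ARCore or Geospatial Creator API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GeospatialAnchor:
    """Fully specified position: the caller supplies the altitude."""
    latitude: float
    longitude: float
    altitude_m: float


@dataclass(frozen=True)
class TerrainAnchor:
    """Only latitude and longitude; ground level is looked up later.

    height_above_ground_m is an assumed convenience offset.
    """
    latitude: float
    longitude: float
    height_above_ground_m: float = 0.0


def resolve_terrain(
    anchor: TerrainAnchor,
    ground_elevation_m: Callable[[float, float], float],
) -> GeospatialAnchor:
    """Turn a terrain anchor into a full position using an elevation lookup."""
    ground = ground_elevation_m(anchor.latitude, anchor.longitude)
    return GeospatialAnchor(
        anchor.latitude, anchor.longitude, ground + anchor.height_above_ground_m
    )


if __name__ == "__main__":
    # A constant 10 m "terrain" stands in for real map data.
    print(resolve_terrain(TerrainAnchor(42.36, -71.06, 1.5), lambda lat, lon: 10.0))
```

A rooftop anchor would follow the same pattern with a building-height lookup in place of the ground lookup.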

Dilmer

58,292 views • 3 years ago

3D, AI, NeRFs and oh my! Immersive view is now rolling out to 5 cities, with 5 more to follow. Experience the best of all views, with helpful info layered on top, so you can decide when & where to go. Proud to be part of the team-of-teams that made this maps milestone a reality!

Bilawal Sidhu

318,440 views • 3 years ago

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior paper page: We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes the geometry consistency but compromises the texture fidelity. We further propose Bootstrapped Score Distillation to specifically boost the texture. We train a personalized diffusion model, Dreambooth, on the augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through an alternating optimization of the diffusion prior and 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state-of-the-art in 3D content generation.
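For readers unfamiliar with score distillation sampling, the gradient it uses is usually written as below (the form introduced by DreamFusion). The symbols are not from the abstract above: \theta are the 3D scene parameters, x = g(\theta) is a rendered view, x_t is that view noised to timestep t with noise \epsilon, \hat{\epsilon}_\phi is the diffusion model's noise prediction under condition y, and w(t) is a timestep weighting. DreamCraft3D's view-dependent and bootstrapped variants change which diffusion prior is plugged in and are not reproduced here.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta)
```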

AK

161,530 views • 2 years ago

WOW. 😳 Apple just quietly won the 3D maps war at WWDC. Gaussian Splatting is coming to Apple Maps Flyover this fall. Apple Maps Flyover covers 300+ cities. Until yesterday, every single one was built on standard drone photogrammetry. The technology captures photos from the air and reconstructs 3D geometry from them. Gaussian Splatting does not reconstruct geometry. It represents the scene as millions of tiny 3D ellipsoids, each one carrying its own color and opacity information based on how light actually behaves in that location. The output is not a mesh model. It is a field of light. When you move through it, it does not crumble at the edges. The detail holds because it was never geometry to begin with. Apple has been hiring for this for years. Their SHARP model, published in research last year, generates photorealistic 3D scenes from a single image in under a second. Google has more sensor data than anyone. More Street View cars, more satellites, more capture history. On navigation accuracy and geodata depth, Google Maps is still ahead by most measures. But fidelity in 3D city rendering is a different competition, and Apple just set a bar in that. Most people will experience this in the fall without knowing the name of the technology. They will open Flyover, look at a city they know, and notice it looks different. Real, not rendered. That is the moment Gaussian Splatting stops being a research term and becomes something a billion people use. Bookmark this. It will look prescient by October.
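To make "ellipsoids carrying color and opacity" concrete, here is a minimal sketch of how splats covering a single pixel are blended front to back, which is the standard compositing rule in 3D Gaussian Splatting renderers. The `composite` function and the tuple layout are illustrative assumptions, and the sketch treats each splat's opacity as already evaluated at the pixel (a real renderer first multiplies it by the projected Gaussian's falloff).

```python
def composite(splats):
    """Blend splats covering one pixel, nearest first.

    splats: list of (depth, (r, g, b), alpha) tuples, alpha in [0, 1].
    Returns the blended color and the remaining transmittance.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer splats
    for _, rgb, alpha in sorted(splats, key=lambda s: s[0]):
        weight = alpha * transmittance
        color = [c + weight * channel for c, channel in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, transmittance


if __name__ == "__main__":
    # A half-transparent red splat in front of an opaque blue one.
    print(composite([(2.0, (0.0, 0.0, 1.0), 1.0), (1.0, (1.0, 0.0, 0.0), 0.5)]))
```

Because each splat's contribution is a smooth, differentiable function of its color, opacity, and position, the whole set can be fitted to photographs by gradient descent.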

Shruti

19,832 views • 1 month ago

V3D: Video Diffusion Models are Effective 3D Generators. Automatic 3D generation has recently attracted widespread attention. Recent methods have greatly accelerated the generation speed, but usually produce less-detailed objects due to limited model capacity or 3D data. Motivated by recent advancements in video diffusion models, we introduce V3D, which leverages the world simulation capacity of pre-trained video diffusion models to facilitate 3D generation. To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce a geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator. Benefiting from this, the state-of-the-art video diffusion model could be fine-tuned to generate 360-degree orbit frames surrounding an object given a single image. With our tailored reconstruction pipelines, we can generate high-quality meshes or 3D Gaussians within 3 minutes. Furthermore, our method can be extended to scene-level novel view synthesis, achieving precise control over the camera path with sparse input views. Extensive experiments demonstrate the superior performance of the proposed approach, especially in terms of generation quality and multi-view consistency.

AK

31,997 views • 2 years ago

Apple Maps 3d is insanely impressive now (see video below). Google, on the other hand, decided to build an insanely clunky solution to display 3d maps to avoid using mobile GPUs: a Google server streams (!) a live video of its 3d rendered map to you. The result is disastrous: a super laggy and clunky 3d view full of video compression artefacts, with a response time of 2-3 seconds for every movement. And I know WHY they built that: someone felt it was not inclusive to only have 3d maps that are high performing on iOS because Apple's GPUs are insanely fast and can deal with it (as you can see below in Apple Maps), but they wanted to support inferior phones, I get it. But then they also decided to worsen the experience for iOS users who could easily run 3d natively on its GPU. It's just an insane product decision, there's no way to cut it: they could have super smooth 3d like Apple Maps but neutered it with a clunky live stream. Even more insane because Google does have native 3d on iOS in a completely different app that nobody uses: Google Earth. Is 3d important? Not rly for nav, no, but it shows again something about Google's engineering decisions, like the redesign of the Google OAuth Login screen that made zero sense. Nobody is speaking up against bad ideas in Google meetings, everyone's too scared of HR maybe, it's the opposite of a meritocracy and it's just super visible from using Google products right now.

@levelsio

1,488,981 views • 1 year ago

3d visual positioning experiment -- look at the alignment between the 3d mesh and the live camera view. Truly feels magical -- like x-ray vision. Workflow: 1. Scanned a street in 15 mins w/ xgrids 2. Localized against that scan at night, in real-time, while sitting in a car The xgrids scanner (w/ rgb + lidar) is perfect to build maps for humans (3d gaussian splats). Then multiset ai makes it easy to build machine-readable maps for AR and robotics to figure out exactly where they are in 3d space with cm-level accuracy.

Bilawal Sidhu

27,278 views • 8 months ago

3D capture is dope for delight, but the utility might be even more impactful. This digital twin of Boston fuses together aerial 3d scans + vector maps + building (BIM) data + historical crime data + utility waterlines + zoning data. You can answer questions like: 1. Which buildings were recently constructed in the Boston skyline? 2. How does crime density correlate with zoning types across the city? 3. Where are the oldest water mains located, and how do they relate to the city's infrastructure? 4. What would this busy street look like with dedicated biking lanes? It's wild the progress the geospatial world is making. And you can do almost all of this with Esri tools and a sprinkling of game engines.

Bilawal Sidhu

85,361 views • 2 years ago

Here's a look behind the scenes of how I used a 3d model, generated from an image, as a "driver" for ai animation. In the near future we'll see more 3d tools that support these types of workflows, with much more emphasis on enhancing user input instead of just letting the ai do everything. I used the 3d model to generate keyframes that I then animated in Luma Dream. By using a 3d model with a diffusion "layer" done with Krea, I got quite high quality frames with a relatively high consistency as well since the AI didn't have to "hallucinate" that much. I upscaled and detailed the frames using Magnific. Using semi-detailed 3d models as drivers for gen ai is very powerful, and it allows us to use models that are sculpted in a more free-flowing type of workflow that doesn't rely so much on high surface detailing or very time-consuming finish, but instead relies more on gestural 3d sculpting. #art #ai

Martin Nebelong

48,200 views • 2 years ago

MVDream: Multi-view Diffusion for 3D Generation paper page: We propose MVDream, a multi-view diffusion model that is able to generate geometrically consistent multi-view images from a given text prompt. By leveraging image diffusion models pre-trained on large-scale web datasets and a multi-view dataset rendered from 3D assets, the resulting multi-view diffusion model can achieve both the generalizability of 2D diffusion and the consistency of 3D data. Such a model can thus be applied as a multi-view prior for 3D generation via Score Distillation Sampling, where it greatly improves the stability of existing 2D-lifting methods by solving the 3D consistency problem. Finally, we show that the multi-view diffusion model can also be fine-tuned under a few-shot setting for personalized 3D generation, i.e. the DreamBooth3D application, where the consistency can be maintained after learning the subject identity.

AK

294,496 views • 2 years ago

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors paper page: We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D mesh generation from a single unposed image in the wild using both 2D and 3D priors. In the first stage, we optimize a neural radiance field to produce a coarse geometry. In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture. In both stages, the 3D content is learned through reference view supervision and novel views guided by a combination of 2D and 3D diffusion priors. We introduce a single trade-off parameter between the 2D and 3D priors to control exploration (more imaginative) and exploitation (more precise) of the generated geometry. Additionally, we employ textual inversion and monocular depth regularization to encourage consistent appearances across views and to prevent degenerate solutions, respectively. Magic123 demonstrates a significant improvement over previous image-to-3D techniques, as validated through extensive experiments on synthetic benchmarks and diverse real-world images.

AK

305,663 views • 3 years ago

People are just realizing something important. One of the largest real-world visual datasets for AI was built by millions of users… without them even knowing it. At Over the Reality, we believe this should work differently. We are building a global 3D map of the world through a community that maps real places on purpose, and gets rewarded for every meaningful contribution. People helping build the global 3D map of the world can earn from $1 to $3 for each location mapped with a smartphone, or up to $18 per hour using an Insta360 X5. Apply here:

Over the Reality 🌐

288,789 views • 4 months ago

The World Cup is pulling in millions of views right now. The question is: How can you use AI and YouTube Automation to benefit from all that attention? This girl shares the step by step of how creators are using it to generate views and monetize content 👇

UI x AI by Christabel

23,058 views • 1 month ago

✨ Made a new mini feature on Photo AI: [ Grab from 3d model ]. So the problem is we're at that stage in time (typical for AI) where image-to-3d models are not good enough but are fun to play with, but we know they'll be good enough in 1-2 years. With [ Make 3d model ] you already can turn any Photo AI pic into a 3d model, but it still looks hyper clunky and deformed, but it works! One cool idea I had to make that more useful and made now: let people make a 3d model, then change the view of it with the 3d viewer, then press [ o ] and it grabs a frame of the 3d. That image you can then [ Remix ] (img2img), and it becomes a real photo again, and that in turn you can then turn into a video again with [ Make video ]. So that essentially gives you a fully freeform camera position control to take photos with. One thing I need to fix is the background/skybox: I kinda need to take the original photo and remove the person and just get the background for the 3d model viewer, in this case it should be white, but it's a start!

@levelsio

119,210 views • 1 year ago

Another city. Another full 3D Gaussian Splat reconstruction. This entire neighborhood was captured while walking with an Insta360 X5, then processed with our GPU pipeline and transformed into a high-fidelity, machine-readable 3D environment with Over the Reality. What you’re seeing is not an isolated virtual world. It’s a 3D digital twin of a real-world location, built from structured spatial data where every building, street, and detail can become usable by AI, robotics, and XR. And it’s happening at scale, across cities in more than 180 countries every day. Have an Insta360 X5 and want to become a mapper? Apply here: Want to build a 3D Gaussian Splat from a video you already recorded? Do it for free at powered by Over the Reality.

Over the Reality 🌐

400,097 views • 3 months ago

Day 33 of vibe coding a game+engine using Cursor. So I approached character animation from a different perspective this time. Instead of 3D -> 2D, I went for native 2D via an image-to-video method. This gives us much richer animations than the limited ones we get with 3D! Eliza is really cute with these! The downside is the time it takes to get them all done, even with Cursor skills/automation. Which ones did you like more? Check the previous posts for the 3D ones!

Startracker 🔺

33,327 views • 6 months ago

Here is some new footage from this paper, offering a glimpse into the future of dynamic 3D Gaussian Splatting models combined with static reconstructed scenes. Imagine this: when the lighting matches, the result becomes practically indistinguishable from reality. Just pick a scene, add characters, and record it from any angle. Apply diffusion models to instantly change the look. I firmly believe this is the future of VFX.

MrNeRF

57,843 views • 8 months ago

✨ Introducing 3 new ways to build with real-world imagery and AI → From the fine details of a city street to a bird's eye view of the planet, at Cloud Next we revealed new ways that businesses can use our Street View, aerial and satellite imagery, and tap into our comprehensive view of the physical world including: 🏙️ Use Maps Imagery Grounding in Gemini Enterprise Agent Platform to generate location-specific assets at scale, anchored in real-world Street View imagery 🌍 Unlock a new perspective with Aerial and Satellite Insights, part of Earth AI, in BigQuery and Agent Platform to quickly extract detailed info about our changing planet 🛰️ Build custom apps with Aerial and Satellite models in Model Garden from Earth AI that easily identify specific objects or scenes Read the blog at the link above and sign up for early access.

Google Maps Platform

12,355 views • 3 months ago