Bilawal Sidhu

@bilawalsidhu • 107,308 subscribers

Spatial intelligence. World models. Visual effects. Creator w/ 2.1M+ audience. Tech Curator @ TED. A16z Scout. Ex-Google PM (AR/VR & 3D Maps) https://t.co/fysPkbPoQ2

Shorts

peak internet: ai generated cctv footage of police arresting ppl for wearing huge boots video credit: u/Qemmish

25,871,934 views

This is IronSight - a 4D reconstruction built by fusing footage from two pairs of meta ray-bans. A google research buddy saw the prototype and joked this would've been a siggraph paper a few years ago. Today it's a weekend build with fable. Stay tuned for more mad science. Happy 4th!

93,652 views

This tech maps the physical world in 3d and snaps it perfectly to my camera feed at 60 fps. Visual Positioning Systems (VPS) are the underhyped backbone of spatial computing. This is how we connect the world of bits & atoms.

92,770 views

Nano Banana Pro is a really good cartographer. Used it to turn low res satellite imagery into a detailed hand drawn map and vector HD map. Pretty wild how well it segments everything and even recovers paths/roads hidden under tree cover. Looks way more detailed than the current google basemap which is pretty sparse in countries like India. Included both in video for comparison.

554,024 views

OpenAI just dropped their Sora research paper. As expected, the video-to-video results are flipping spectacular 🪄 A few other gems:

1,873,673 views

Damn it worked! Genie 3 world --> inpaint UI --> 4x topaz AI upscale --> train 3d gaussian splat. You can step inside a painting of Socrates from 1787. Better than any image-to-3d model I've seen. I think Google has stumbled upon the killer app for VR -- the literal holodeck.

657,658 views

Generative AI is super cool… BUT I’m continually blown away with the work happening in ‘procedural’ 3D modelling. Especially given these plugins are for a free (!) 3D tool like Blender. We needed a fancy Houdini license to do this just a few years ago 🤯

1,365,752 views

Loving how this turned out! IronSight turns Meta Ray-Ban clips (from the range) into 4D reconstructions you can replay from any angle -- including an AR view that sees targets straight through walls. It 3D tracks both runs, auto locates every target, and scores hits vs misses using audio cues + Gemini for multimodal reasoning. Full breakdown coming to the channel. The test below is where this started, and then Fable showed up and I blitzed through my whole roadmap in a few days.

26,116 views

One of the wildest emergent capabilities of Genie 3 is that maps actually work. As I walk around the forest, the GPS display updates its heading in real time. Remember. There is no game engine here. This is an AI hallucinating a working navigational instrument purely from next frame prediction. 🤯

247,037 views

I love 3d maps like this because they’re inherently an abstraction of reality. Unlike 3d scans, the goal isn’t to create a 1:1 mirror world. The goal is to create a stylized distillation down to its visual essence, so you can recognize it effortlessly.

249,351 views

Lmao. What niche even is this — grassroots dirt track racing meets google maps nerds? Veo 3 videos are seriously ridiculous and fun. Turn audio on for max enjoyment.

452,046 views

The lines between code & content are blurring. I made this 3d city block animation in claude 3.7. Then I used runway gen-3's video-to-video to style it like a lego city at night. At the rate things are going, this'll be a shader running in real-time.

487,553 views

Generative AI is cool and all, but procedural 3D modelling continues to hit the spot

500,696 views

AI stitching together multiple video feeds into one omniscient traffic god. This is what happens when cameras start talking to each other -- mapping the trajectory of every vehicle and pedestrian seamlessly across cameras. Spatial intelligence is coming to a city near you.

328,202 views

Nano is a depth-aware atmospheric haze plugin that uses ML depth estimation to add physically accurate fog and light scattering to your footage. Works best on log footage with visible light sources - it analyzes scene highlights, then creates airlight (atmospheric scatter) and halation (light bloom) that respond to actual depth in the scene. Pretty clever approach to getting that cinematic haze look without having to pump a fog machine on set. Makes the OG Trapcode Shine look extremely dated (basically 2D light streaks masked by luminance values), and yet it is way more controllable than the current crop of generative AI video-to-video tools.

275,292 views

Generative AI is really cool but sometimes you want to drift your car through an intersection in a super specific way. Draw a spline then get 3D onion skinning so you can adjust the curves with a clear spatial reference. iCars plugin for Blender:

183,667 views

5 min video from an insta360 camera turned into a big ass 3d gaussian splat using the new niantic scaniverse app. it's kinda wild how well we can model the complexity of reality, and run it in realtime in a browser at 100fps

62,824 views

Heads up! Mosaic dropped a pretty wild dataset of 1.26 million 360° images of Prague 🤯 If you're a researcher, creator or developer into 3D/AI/Geo, I think you're gonna wanna play with this. Here's the scoop on this 15 TERAPIXEL dataset & the crazy things you can do with it 🧵

295,800 views

Google just took a big step towards building ChatGPT for Earth. AlphaEarth Foundations does something clever -- instead of drowning in petabytes of Earth observation data, it creates compact summaries of every 10x10m square on Earth by fusing optical, radar, LiDAR, and climate data. The kicker is it can see through clouds in Ecuador and reveal hidden agricultural patterns in Canada. MapBiomas and Global Ecosystems Atlas already using it for conservation work.

165,867 views

I can no longer walk around a city without seeing a machine-readable 3D model in my head. This is a geometric and semantic 3D model of San Francisco. These maps connect the world of bits and atoms, enabling visual positioning, 3D navigation and geospatial intelligence.

260,195 views

Videos

holy crap! apple just beat google to the punch -- 3d gaussian splatting is coming to apple maps. these 3d scenes are made from oblique aerial imagery. but unlike blobby photogrammetry -- no more broccoli trees, no more melted powerlines -- ground level detail that actually holds up. here's hoping google maps/earth follows suit soon -- they have a significantly larger corpus of sensor data to work with. time to splat the world!

1,090,688 views • 1 month ago

God's eye view 24-hour replay of Operation Epic Fury. The Iran strikes kicked off and I set an AI agent swarm loose to record every OSINT signal I could find before the caches cleared. Built a full 4D reconstruction in WorldView. I can scrub through minute by minute and watch the whole thing unfold on a 3D globe:
> Airspace clearing over Tehran
> Ground strike coordinates locking in
> Severe GPS interference blinding the region
> EO and SAR satellites making passes over the strike zone
> No-fly zones locking down 9 countries
> Shipping fleets scrambling at the Strait of Hormuz
It's pretty amazing how complete of a picture you can build without "proprietary data fusion" -- one dev with public signals and a love for computer graphics and geospatial intelligence. Thank you for all the love on my last post. Dropping WorldView in April. This my friends is just the beginning.

3,999,520 views • 4 months ago

OG Anunoby is too sick. Here is the full point-of-view 3d reconstruction of his winning tip-in from the Knicks game last night. You can literally relive it from his perspective. Built with viewpoint pro using stadium tracking cameras and unreal engine.

716,890 views • 1 month ago

Just used claude fable (aka mythos) to create this city block simulator complete with multi-agent traffic, live detection boxes + tracks, and day to night cycle. And it just one shotted it. This is gonna be fun -- the gap between idea and execution just keeps collapsing.

372,058 views • 1 month ago

Before/after of Corridor's latest AI video is wild. They shot video on greenscreen, made virtual sets in Unreal, then reskinned it to anime by finetuning Stable Diffusion. Net result? 120 VFX shots done by a team of 3 on a dime. Bravo! This is a milestone in creative technology🧵

13,178,924 views • 3 years ago

bro putting a belt fed machine gun on a monitor arm is probably the most america thing i've seen all week. shout out eric pettway.

416,360 views • 1 month ago

I made a 4D god's eye replay of the Iran strikes using public OSINT data. When I turned on the orbital layer in worldview something jumped out. You can see satellite passes stack up over the strike zones in the hours before & after impact. Everyone was watching. Some of them were overhead before it started. American KH-11s and TOPAZ SAR. Russian BARS-M and Persona. Chinese Gaofen optical and SAR. Maxar WorldView Legion. Airbus Pleiades. Capella. ICEYE. That's textbook behavior -- you collect right before for targeting, you strike, then you collect again for battle damage assessment. Just wild to see it all replayed in 3D like this. The commercial constellation density is also striking. What used to be exclusive nation state capability is now mirrored by half a dozen commercial operators. The intelligence monopoly is over.

718,701 views • 4 months ago

Okay it happened! Snapchat Spectacles AR glasses. Fully standalone. 46 degree field of view. 37 pixels per degree. That’s roughly like a 100” TV screen! 2x snapdragon chips. 45 minutes of battery. Auto transitioning lenses. Designed for co-presence. Spectator mode and more. Gotta hand it to Snap for pushing on this extremely hard engineering challenge, despite the rest of their peers going VR first as a stepping stone to this ultimate vision.

2,664,068 views • 1 year ago

Everything here is 100% generated w/ Google Veo 2. I've got early access, and the visual fidelity and prompt adherence is genuinely nuts. Let's test it together and have some fun. Drop your prompts below -- and for the next hour or so I'll reply with videos 👇

2,242,545 views • 1 year ago

Google is now using Gemini to cross-reference ~250M places with Street View imagery to identify visible landmarks for turn-by-turn nav. Think iconic buildings, gas stations and restaurants. So instead of "turn right in 500 feet" you get "turn right after the Thai Siam Restaurant" with the landmark highlighted. AI solving the distance estimation problem by using what you can actually see. Rolling out in US.

962,971 views • 8 months ago

Hollywood imagined a camera that could watch over an entire city. The military actually built it. There's a secret camera that can literally rewind time over an entire city. It captures 30 square miles at once. Every car, every person, every movement is recorded, indexed, and completely searchable. If an attack happens, operators just hit "rewind" to track the suspects back to where they came from. Turns out Enemy of the State wasn't fiction. It was a technical roadmap. And the folks that worked on the movie's visual effects actually helped build the real system. This is the story of WAMI -- the ultimate god's eye view, and how a spy-thriller movie trope became reality. I spent weeks breaking down how it works and the AI infused upgrade taking it to the next level.
00:00 What Is WAMI?
02:27 The Hollywood Origin Story
06:19 Baltimore's Secret Spy Planes
10:18 The Shadow Air Force
12:47 AI Automated Mass Tracking
15:45 Ukraine's Battlefield AI
17:48 Live 3D Scene Reconstruction
19:38 The Living Digital Twin

63,470 views • 18 days ago

This will change the way we experience sports forever -- watching the game from a god's eye view. Arcturus is building 4D gaussian splatting tech that can capture every angle of a sporting event and raises the bar for volumetric video. I tested this in a headset and it makes 360 and 3D 180 video look ancient. This puts you closer to the action than any stadium seat, and it’s genuinely mind blowing.

501,441 views • 5 months ago

OpenClaw creator on Opus vs Codex: “Opus is like the coworker that is a little silly sometimes, but it's really funny and you keep him around. Codex is like the weirdo in the corner that you don't want to talk to, but he's reliable and gets shit done.” LMAO. Accurate.

429,825 views • 5 months ago

Is the Strait of Hormuz open or closed? Came up over lunch so I voice noted my agent to "boot up god's eye view and check." It sent me back this timelapse clip -- refreshed the AIS data, checked oil futures, mapped 120 days of vessel transits and rendered it out. TL;DR it's kinda open. Traffic is moving again, just not back to pre-conflict normal. Feels like I have a geopolitical analyst on demand. We're already living in the future my friends!

68,522 views • 27 days ago

Veo 3 has digested the mother lode of ASMR content on YouTube making it an AI ASMR machine. This one got 3.1M likes and 12k comments in 3 days. Every popular YouTube format is about to get its impossible AI remix.

853,652 views • 1 year ago

The internet going wild with the microwave AI filter -- prolly because it's pure nightmare fuel 😭

1,026,814 views • 1 year ago

RIP truth - it's been nice knowing ya

1,402,768 views • 2 years ago

Generative ai is cool but procedural 3d generation just hits different. iCity basically feels like sim city for adults who know blender.

533,170 views • 9 months ago

"When you ask your stoned roommates to put away the groceries" 😭

769,860 views • 1 year ago

People are undoubtedly a little alarmed at having unwittingly helped build a 3D map of the world for Niantic by contributing 30 billion crowdsourced images. I interviewed Niantic's CTO Brian McClendon about exactly this in a TED interview last year -- he's also the guy who co-created Google Earth.
But let's put it in perspective. Pokestop data isn't what you think it is. It's not a surveillance panopticon of your neighborhood. These are static captures of parks, statues, murals, landmarks -- the places people congregate. Brian described it as "building the map from the bottom up, from the locations where people spend time." Think of these 20 million waypoints as basically the inverse of what Google mapped with Street View. Google mapped the drivable streets. Niantic mapped where people actually hang out. Cool data, genuinely useful for visual positioning -- but very different from what the headlines imply.
And lest we forget that Niantic is just one of many companies quietly building their own map of the world right now -- and they're all capturing different facets of reality:
> 🚶 person-level: Axon body cams on hundreds of thousands of officers. Meta Ray-Ban glasses capturing first-person POV at scale -- overseas operators reviewing images every time someone says "Hey Meta."
> 🚗 vehicle-level: Tesla dashcams on every car in the fleet, massive onboard compute extracting and distilling data to the cloud. Waymo with cm-accurate 3D maps of every city they operate in. Fleet telematics cameras on delivery vehicles globally.
> 🏠 street & home-level: Flock Safety deploying CCTV across neighborhoods and cities. Amazon with Ring cameras on every doorstep and mailroom (recently got dragged over that Super Bowl commercial about fusing all these cams together to find your dog) plus dashcams on every Prime delivery van. Roomba mapping your floor plan every time it vacuums -- Amazon wanted that data badly enough to try acquiring iRobot for $1.7B before regulators shut it down.
> 🥽 headset-level: Apple Vision Pro and Meta Quest build a 3D model of whatever room you're in every time you put them on. Between Ring, Roomba, and your headset, your entire home is being spatially understood by at least three different companies.
> 📍 platform-level: Google with Street View cars, aerial planes, satellite imagery, and live location from every Android phone in your pocket. Apple doing the same with mapping cars AND every LiDAR iPhone is quietly a 3D scanner. And yeah, despite the "Apple is too privacy-conscious" narrative, they're collecting location data too.
> 🏃 trajectory-level: Strava mapped every running and cycling trail on Earth -- and accidentally exposed secret military bases in Afghanistan and Syria because soldiers logged their jogs. When you aggregate enough individual trajectories, patterns emerge that were never supposed to be visible.
> 🛰️ space-level: Planet Labs imaging the entire Earth's landmass every single day from orbit. Vantor capturing it in higher detail. Iceye doing it in 3D using SAR. If something changes anywhere on the planet -- a building goes up, a forest burns down, a military convoy moves -- before-and-after imagery within 24 hours.
Fused together -- we have everything from body cam to dashcam to doorbell to phone to satellite -- every layer of physical reality is being mapped by somebody right now. Different sensors, different angles, different purposes. Same pattern.
The interesting part is how they incentivize it. Google spends billions. Mapillary tried altruism. Hivemapper grinds with crypto. Pokémon GO cracked something none of them could: a game mechanic that subsidizes the scanning behavior. You're not building a map. You're catching pokemon. The map is just a side effect.
3D scanning is still a niche hobby for reality capture nerds like me. The moment somebody gamifies dense 3D capture at scale -- not posed photos but actual geometry -- that's when this blows wide open.
Niantic sold the games for $3.5B but kept the spatial platform, with a data-sharing agreement in place. One team makes the game great, the other builds the spatial infrastructure underneath. Incentives finally aligned. Gaming is becoming a way for humans to contribute real-world trajectories that help physical AI learn about the real world. Google does it with live traffic. Tesla does it with autopilot. The mechanic is different but the pattern is identical -- and most people are already part of at least one -- if not a majority -- of these datasets whether they realize it or not.

203,550 views • 4 months ago