
Bilawal Sidhu
@bilawalsidhu • 106,292 subscribers
Spatial intelligence. World models. Visual effects. Creator w/ 1.6M+ audience. Tech Curator @ TED. A16z Scout. Ex-Google PM (AR/VR & 3D Maps) https://t.co/fysPkbPoQ2

God's eye view 24-hour replay of Operation Epic Fury. The Iran strikes kicked off and I set an AI agent swarm loose to record every OSINT signal I could find before the caches cleared. Built a full 4D reconstruction in WorldView. I can scrub through minute by minute and watch the whole thing unfold on a 3D globe:
> Airspace clearing over Tehran
> Ground strike coordinates locking in
> Severe GPS interference blinding the region
> EO and SAR satellites making passes over the strike zone
> No-fly zones locking down 9 countries
> Shipping fleets scrambling at the Strait of Hormuz
It's pretty amazing how complete a picture you can build without "proprietary data fusion" -- one dev with public signals and a love for computer graphics and geospatial intelligence. Thank you for all the love on my last post. Dropping WorldView in April. This, my friends, is just the beginning.
Bilawal Sidhu • 3,993,968 views • 3 months ago

Before/after of Corridor's latest AI video is wild. They shot video on greenscreen, made virtual sets in Unreal, then reskinned it to anime by finetuning Stable Diffusion. Net result? 120 VFX shots done by a team of 3 on a dime. Bravo! This is a milestone in creative technology🧵
Bilawal Sidhu • 13,177,834 views • 3 years ago

I made a 4D god's eye replay of the Iran strikes using public OSINT data. When I turned on the orbital layer in WorldView something jumped out. You can see satellite passes stack up over the strike zones in the hours before & after impact. Everyone was watching. Some of them were overhead before it started. American KH-11s and TOPAZ SAR. Russian BARS-M and Persona. Chinese Gaofen optical and SAR. Maxar WorldView Legion. Airbus Pleiades. Capella. ICEYE. That's textbook behavior -- you collect right before for targeting, you strike, then you collect again for battle damage assessment. Just wild to see it all replayed in 3D like this. The commercial constellation density is also striking. What used to be an exclusive nation-state capability is now mirrored by half a dozen commercial operators. The intelligence monopoly is over.
Bilawal Sidhu • 717,537 views • 3 months ago

Google is now using Gemini to cross-reference ~250M places with Street View imagery to identify visible landmarks for turn-by-turn nav. Think iconic buildings, gas stations and restaurants. So instead of "turn right in 500 feet" you get "turn right after the Thai Siam Restaurant" with the landmark highlighted. AI solving the distance estimation problem by using what you can actually see. Rolling out in US.
Bilawal Sidhu • 962,854 views • 6 months ago

Okay it happened! Snapchat Spectacles AR glasses. Fully standalone. 46 degree field of view. 37 pixels per degree. That’s roughly like a 100” TV screen! 2x Snapdragon chips. 45 minutes of battery. Auto transitioning lenses. Designed for co-presence. Spectator mode and more. Gotta hand it to Snap for pushing on this extremely hard engineering challenge, despite the rest of their peers going VR first as a stepping stone to this ultimate vision.
Bilawal Sidhu • 2,663,673 views • 1 year ago
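The "100-inch TV" comparison in the Spectacles post above can be sanity-checked with a little trigonometry. This is a rough sketch, not Snap's published math: it assumes the 46 degrees is the diagonal field of view and that the virtual screen is flat and centered, neither of which the post states. The function and constant names are made up for illustration.

```python
import math

FOV_DEG = 46.0               # from the post; ASSUMED to be the diagonal field of view
PIXELS_PER_DEGREE = 37.0     # from the post
SCREEN_DIAGONAL_IN = 100.0   # the "100-inch TV" in the comparison


def viewing_distance_in(diagonal_in: float, fov_deg: float) -> float:
    """Distance (inches) at which a flat screen of this diagonal spans fov_deg."""
    half_angle = math.radians(fov_deg / 2)
    return (diagonal_in / 2) / math.tan(half_angle)


def pixels_across(fov_deg: float, ppd: float) -> float:
    """Approximate pixel count along the field of view at a uniform density."""
    return fov_deg * ppd


distance_ft = viewing_distance_in(SCREEN_DIAGONAL_IN, FOV_DEG) / 12
diagonal_pixels = pixels_across(FOV_DEG, PIXELS_PER_DEGREE)

print(f"100-inch screen fills the view from about {distance_ft:.1f} ft")
print(f"about {diagonal_pixels:.0f} pixels along the diagonal")
```

Under those assumptions a 100-inch diagonal fills the view from a bit under 10 feet away, and 46 degrees at 37 pixels per degree works out to about 1,700 pixels along that diagonal.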

Everything here is 100% generated w/ Google Veo 2. I've got early access, and the visual fidelity and prompt adherence are genuinely nuts. Let's test it together and have some fun. Drop your prompts below -- and for the next hour or so I'll reply with videos 👇
Bilawal Sidhu • 2,242,386 views • 1 year ago

This will change the way we experience sports forever -- watching the game from a god's eye view. Arcturus is building 4D gaussian splatting tech that can capture every angle of a sporting event and raises the bar for volumetric video. I tested this in a headset and it makes 360 and 3D 180 video look ancient. This puts you closer to the action than any stadium seat, and it’s genuinely mind-blowing.
Bilawal Sidhu • 480,118 views • 3 months ago

OpenClaw creator on Opus vs Codex: “Opus is like the coworker that is a little silly sometimes, but it's really funny and you keep him around. Codex is like the weirdo in the corner that you don't want to talk to, but he's reliable and gets shit done.” LMAO. Accurate.
Bilawal Sidhu • 429,619 views • 3 months ago

Semantically annotating 3D gaussian splats on the fly using Gemini 3.1 + sparkjs
1. Load any 3D scene and hit scan
2. Get 2D detections from VLM
3. Cluster outputs & project into 3D world space
4. Save as a persistent 3D semantic layer
Inspired by Alexander Chen's experiments with Gemini visual intelligence. Just had to try to lift it from 2D to 3D!
Bilawal Sidhu • 55,921 views • 20 days ago
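Steps 2 to 4 of the splat-annotation post above (2D detections, projection into world space, clustering) can be sketched in plain Python. This is an illustration under stated assumptions, not the author's code: it assumes a pinhole camera looking down +z, a known camera-to-world pose per view, and a depth value for each detection (for example, read from the rendered splat). `Camera`, `unproject`, and `cluster` are hypothetical names, and the greedy radius clustering stands in for whatever method the original demo uses.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Camera:
    fx: float            # focal lengths in pixels
    fy: float
    cx: float            # principal point in pixels
    cy: float
    rotation: list       # 3x3 camera-to-world rotation, row-major
    translation: tuple   # camera position in world space


def unproject(cam, u, v, depth):
    """Lift pixel (u, v) at the given depth into world space (pinhole model)."""
    p_cam = ((u - cam.cx) / cam.fx * depth, (v - cam.cy) / cam.fy * depth, depth)
    return tuple(
        sum(cam.rotation[i][j] * p_cam[j] for j in range(3)) + cam.translation[i]
        for i in range(3)
    )


def cluster(points, radius=0.5):
    """Greedily merge same-label points whose cluster centroid is within radius."""
    clusters = []
    for label, p in points:
        for c in clusters:
            if c["label"] == label and dist(c["centroid"], p) <= radius:
                n = c["count"]
                # Update the running mean of the cluster position.
                c["centroid"] = tuple((c["centroid"][i] * n + p[i]) / (n + 1) for i in range(3))
                c["count"] = n + 1
                break
        else:
            clusters.append({"label": label, "centroid": p, "count": 1})
    return clusters


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cam_a = Camera(500, 500, 320, 240, IDENTITY, (0.0, 0.0, 0.0))
cam_b = Camera(500, 500, 320, 240, IDENTITY, (1.0, 0.0, 0.0))  # shifted 1 unit along x

# Made-up detections: (label, camera, pixel u, pixel v, depth).
detections = [
    ("chair", cam_a, 320, 240, 2.0),
    ("lamp", cam_a, 420, 240, 2.5),
    ("chair", cam_b, 70, 240, 2.0),   # the same chair seen from the second view
]

layer = cluster([(label, unproject(cam, u, v, d)) for label, cam, u, v, d in detections])
for entry in layer:
    print(entry)
```

Here the two chair detections from different views land on the same world-space point and merge into one annotation, which is the property that makes the layer persistent across viewpoints. Saving `layer` (for instance as JSON alongside the scene) would correspond to step 4.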

People are undoubtedly a little alarmed at having unwittingly helped build a 3D map of the world for Niantic by contributing 30 billion crowdsourced images. I interviewed Niantic's CTO Brian McClendon about exactly this in a TED interview last year -- he's also the guy who co-created Google Earth.
But let's put it in perspective. PokéStop data isn't what you think it is. It's not a surveillance panopticon of your neighborhood. These are static captures of parks, statues, murals, landmarks -- the places people congregate. Brian described it as "building the map from the bottom up, from the locations where people spend time." Think of these 20 million waypoints as basically the inverse of what Google mapped with Street View. Google mapped the drivable streets. Niantic mapped where people actually hang out. Cool data, genuinely useful for visual positioning -- but very different from what the headlines imply.
And lest we forget, Niantic is just one of many companies quietly building their own map of the world right now -- and they're all capturing different facets of reality:
> 🚶 person-level: Axon body cams on hundreds of thousands of officers. Meta Ray-Ban glasses capturing first-person POV at scale -- overseas operators reviewing images every time someone says "Hey Meta."
> 🚗 vehicle-level: Tesla dashcams on every car in the fleet, massive onboard compute extracting and distilling data to the cloud. Waymo with cm-accurate 3D maps of every city they operate in. Fleet telematics cameras on delivery vehicles globally.
> 🏠 street & home-level: Flock Safety deploying CCTV across neighborhoods and cities. Amazon with Ring cameras on every doorstep and mailroom (recently got dragged over that Super Bowl commercial about fusing all these cams together to find your dog) plus dashcams on every Prime delivery van. Roomba mapping your floor plan every time it vacuums -- Amazon wanted that data badly enough to try acquiring iRobot for $1.7B before regulators shut it down.
> 🥽 headset-level: Apple Vision Pro and Meta Quest build a 3D model of whatever room you're in every time you put them on. Between Ring, Roomba, and your headset, your entire home is being spatially understood by at least three different companies.
> 📍 platform-level: Google with Street View cars, aerial planes, satellite imagery, and live location from every Android phone in your pocket. Apple doing the same with mapping cars AND every LiDAR iPhone is quietly a 3D scanner. And yeah, despite the "Apple is too privacy-conscious" narrative, they're collecting location data too.
> 🏃 trajectory-level: Strava mapped every running and cycling trail on Earth -- and accidentally exposed secret military bases in Afghanistan and Syria because soldiers logged their jogs. When you aggregate enough individual trajectories, patterns emerge that were never supposed to be visible.
> 🛰️ space-level: Planet Labs imaging the entire Earth's landmass every single day from orbit. Vantor capturing it in higher detail. ICEYE doing it in 3D using SAR. If something changes anywhere on the planet -- a building goes up, a forest burns down, a military convoy moves -- before-and-after imagery within 24 hours.
Fused together -- we have everything from body cam to dashcam to doorbell to phone to satellite -- every layer of physical reality is being mapped by somebody right now. Different sensors, different angles, different purposes. Same pattern.
The interesting part is how they incentivize it. Google spends billions. Mapillary tried altruism. Hivemapper grinds with crypto. Pokémon GO cracked something none of them could: a game mechanic that subsidizes the scanning behavior. You're not building a map. You're catching Pokémon. The map is just a side effect.
3D scanning is still a niche hobby for reality capture nerds like me. The moment somebody gamifies dense 3D capture at scale -- not posed photos but actual geometry -- that's when this blows wide open.
Niantic sold the games for $3.5B but kept the spatial platform, with a data-sharing agreement in place. One team makes the game great, the other builds the spatial infrastructure underneath. Incentives finally aligned. Gaming is becoming a way for humans to contribute real-world trajectories that help physical AI learn about the real world. Google does it with live traffic. Tesla does it with Autopilot. The mechanic is different but the pattern is identical -- and most people are already part of at least one -- if not a majority -- of these datasets whether they realize it or not.
Bilawal Sidhu • 203,106 views • 2 months ago

The internet going wild with the microwave AI filter -- prolly because it's pure nightmare fuel 😭
Bilawal Sidhu • 1,026,654 views • 1 year ago

Long live bullet time. The future of sports is 4D gaussian splatting.
Bilawal Sidhu • 126,174 views • 1 month ago

"When you ask your stoned roommates to put away the groceries" 😭
Bilawal Sidhu • 769,812 views • 1 year ago

Wow. Recreating the Shawshank Redemption prison in 3D from a single video, in real time (!)
Just read the MASt3R-SLAM paper and it's pretty neat. These folks basically built a real-time dense SLAM system on top of MASt3R, which is a transformer-based neural network that can do 3D reconstruction and localization from uncalibrated image pairs.
The cool part is they don't need a fixed camera model -- it just works with arbitrary cameras -- think different focal lengths, sensor sizes, even handling zooming in video (FMV drone video anyone?!). If you've done photogrammetry or played with NeRFs you know that is a HUGE deal.
They've solved some tricky problems like efficient point matching and tracking, plus they've figured out how to fuse point clouds and handle loop closures in real-time. Their system runs at about 15 FPS on a 4090 and produces both camera poses and dense geometry. When they know the camera calibration, they get SOTA results across several benchmarks, but even without calibration, they still perform well.
What's interesting is the approach -- most recent SLAM work has built on DROID-SLAM's architecture, but these folks went a different direction by leveraging a strong 3D reconstruction prior. Seems to give them more coherent geometry, which makes sense since that's what MASt3R was designed for.
For anyone who cares about monocular SLAM and 3D reconstruction, this feels like a significant step toward plug-and-play dense SLAM without calibration headaches -- perfect for drones, robots, AR/VR -- the works!
Bilawal Sidhu • 703,649 views • 1 year ago

Visualizing the Strait of Hormuz shutdown using AIS tracking data. Wild to see the precipitous drop-off in transits from hundreds of vessels per day to a handful. The craziest part is zooming in to track a single Indian LPG tanker and watching it intentionally go dark -- turning off its AIS to sneak through the chokepoint, before popping right back on the radar on the other side. OSINT is too much fun. Pulling this together into a bigger 4D god's eye view of the gulf region.
Bilawal Sidhu • 95,870 views • 2 months ago
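The "going dark" behavior described in the AIS post above shows up in the data as a long silence between two position reports from the same vessel. A minimal sketch of flagging such gaps follows. It assumes you already have a list of report timestamps for one vessel; `find_dark_gaps`, the one-hour threshold, and the sample pings are all hypothetical, not taken from the post.

```python
from datetime import datetime, timedelta


def find_dark_gaps(timestamps, threshold=timedelta(hours=1)):
    """Return (last_seen, next_seen) pairs where the silence exceeds threshold."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > threshold
    ]


# Made-up AIS report times for a single vessel.
pings = [
    datetime(2025, 1, 1, 0, 0),
    datetime(2025, 1, 1, 0, 10),
    datetime(2025, 1, 1, 0, 20),
    datetime(2025, 1, 1, 4, 50),   # first report after a long silence
    datetime(2025, 1, 1, 5, 0),
]

gaps = find_dark_gaps(pings)
for last_seen, next_seen in gaps:
    print(f"silent from {last_seen} to {next_seen} ({next_seen - last_seen})")
```

A gap like this is only a candidate: reports also drop out because of poor receiver coverage or interference, so a flagged vessel still needs a look at where it vanished and where it reappeared.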