
Felix Heide
@_FelixHeide_ • 1,949 subscribers
Princeton Computational Imaging Lab: https://t.co/n8gRRpdvr4 Head of AI at Torc Robotics: https://t.co/7RonQDi1MJ

ScenarioControl 🚗🛣️ - Scenario Generation from a single Dashcam Image 📸 or Text Prompt 💬!! Excited to introduce a new vision-language control mechanism for learned driving scenario generation. Given a single dashcam image or a text scene prompt, we generate a full scene layout 🧩 and temporally consistent rollouts, including map 🗺️, agents 🚗, and ego video 🛣️. ScenarioControl enables direct, fine-grained control over layout and traffic while preserving realism. It operates in a vectorized latent space, with a new cross-global control mechanism that fuses vision-language inputs with scene structure. Interfaces seamlessly with generative video models! Project: Super fun project by Lili Gao, Yanbo Xu, William Koch, Samuele Ruffino, Luke Rowe, Behdad Chalaki, Dmitriy Rivkin, Julian Ost, Roger Girgis, Mario Bijelic.
Felix Heide • 22,238 views • 1 month ago

WorldFlow3D: Unbounded 3D World Generation 🌍 by Flow Through Hierarchical Distributions, without VAEs! We reformulate 3D generation as flowing through sequentially finer 3D distributions, cutting training time by more than half ⏱️ compared to existing approaches! Vectorized map layouts provide full scene controllability 🗺️, and a novel flow-field alignment process enables causally coherent, spatially unbounded generation 🌍. This generative method generalizes across both real and synthetic data distributions! Project: Project led by Amogh Joshi and Julian Ost. Will be super fun to build on this! 🔥
Felix Heide • 19,441 views • 1 month ago

Splines instead of Gaussians 😉 Introducing Neural Spline Fields, which can see through occlusions! We learn to represent a stack of misaligned captures as a multi-layer image sandwich. You can then extract your favorite layer to remove occlusions, reflections, or even your own shadow from the scene! Paper and Code: Fun work by Ilya Chugunov, David Shustin, Ruyu Yan, and Chenyang Lei.
Felix Heide • 137,270 views • 2 years ago

Excited to share our #NeurIPS2025 work on learning motion hierarchies! We introduce a general hierarchical graph learning method that learns structured, interpretable motion directly from data, with no prior structure or assumptions needed! Project and Paper: Amazing work led by William Koch, Cheng Zheng, and Baiang Li! See us in San Diego for #NeurIPS2025!
Felix Heide • 25,308 views • 6 months ago

Starting the new year without human labeling 🎉!! Multimodal LiDAR-camera data is a gold mine of dense 3D geometry hiding in plain sight. For supervised pretraining and validation at scale at Torc Robotics, we rely on fully automated pseudo-labeling pipelines. The pipeline exploits geometric priors from temporally accumulated LiDAR maps, and an iterative update rule enforces joint geometric-semantic consistency while detecting moving objects via inconsistencies. We achieve 3D semantic labels and 3D bounding boxes with human-like quality at the 200 m+ range required for highway driving. Paper: Exciting work at Torc Robotics with Filippo Ghilotti, Samuel Brucker, Nahku Saidy, Matteo Matteucci, Mario Bijelic.
Felix Heide • 18,194 views • 5 months ago

Large-scale 3D Scene Generation (all scenes are real-time rendered)!! Physically grounded generative data without hallucinations is the missing link for robot learning and testing at scale. We introduce a method that directly generates large-scale 3D driving scenes with accurate, explicit 3D geometry, allowing for causal view synthesis and generation with object permanence. This also allows for extreme trajectory extrapolation without failure! We also show that we can build fully data-driven simulators for end-to-end learning with this approach. Project: With the amazing team of Julian Ost, Amogh Joshi, Andrea Ramazzina, Maximilian Bömer, Mario Bijelic.
Felix Heide • 27,736 views • 9 months ago

3D Object Tracking without Training Data? In our Nature Machine Intelligence paper, we recast 3D tracking as an inverse neural rendering task: we fit the scene graph that best explains a given image. The method generalizes to completely unseen datasets and is explainable. Project and Code: Fun collaboration between Princeton Computer Science and Torc Robotics, with Julian Ost and Tanushree Banerjee leading this project.
Felix Heide • 27,858 views • 9 months ago

Evaluating Neural Networks at the Speed of Light (with Light!). See live optical inference in the video below. Excited to share recent academic work on optical neural networks built as a collection of computing elements embedded in the camera lens! These elements perform computation optically before an image is even captured, using the photons in the scene instead of GPU computation after capture. We achieved ImageNet classification more than two orders of magnitude faster than conventional neural networks on today's GPUs, at almost no power consumption! To do this, we developed an array of metalenses that perform this computation on light from the scene. Project: Paper: Amazing collaboration with Kaixuan Wei, Xiao Li, Johannes Froech, Praneeth Chakravarthula, James Whitehead, Ethan Tseng, Arka Majumdar.
Felix Heide • 29,888 views • 1 year ago

Implicit Neural Light Spheres lets you turn panoramic captures into dynamic wide-FOV renders (with real-time rendering!). Instead of generating panoramas with image stitching, we use neural light spheres to jointly estimate the camera path and a high-resolution scene reconstruction, producing novel wide field-of-view projections of the environment. Code, data, and info: Amazing work with Ilya Chugunov (@ilyac on bsky), Amogh Joshi, Kiran Murthy, François Bleibel.
Felix Heide • 10,215 views • 1 year ago