Loading video...

Video Failed to Load

Go Home

Meta releases VGGSfM Visual Geometry Grounded Deep Structure From Motion Structure-from-motion (SfM) is a long-standing problem in the computer vision community, which aims to reconstruct the camera poses and 3D structure of a scene from a set of unconstrained 2D images. Classical frameworks solve this problem in an incremental...

95,553 views • 1 year ago •via X (Twitter)

0 Comments

No comments available

Comments from the original post will appear here

Related Videos

Wonderland: Navigating 3D Scenes from a Single Image Contributions: • First, we introduce a representation for controllable 3D generation by leveraging the generative priors from camera-guided video diffusion models. Unlike image models, video diffusion models are trained on extensive video datasets. This enables them to capture comprehensive spatial relationships within scenes across multiple views and embed a form of "3D awareness" in their latent space, which allows us to maintain 3D consistency in novel view synthesis. • Second, to achieve controllable novel view generation, we empower video models with precise control over specified camera motions. We introduce a novel dual-branch conditioning mechanism that effectively incorporates desired diverse camera trajectories into the video diffusion model. This enables expansion of a single image into a multi-view consistent capture of a 3D scene with precise pose control. • Third, to achieve efficient 3D reconstruction, we directly transform video latents into 3DGS. We propose a novel latent-based large reconstruction model (LaLRM) that lifts video latents to 3D in a feed-forward manner. With this design, during inference, our model directly predicts 3DGS from a single input image, effectively aligning the generation and reconstruction tasks—and bridging image space and 3D space—through the video latent space. Compared with reconstructing scenes from images, the video latent space offers a 256× spatial-temporal reduction while retaining essential and consistent 3D structural details. Such a high degree of compression is crucial, as it allows the LaLRM to handle a wider range of 3D scenes within the reconstruction framework, with the same memory constraints.

MrNeRF

52,801 views • 1 year ago

Colmap 4.0 was very recently released, so it inspired me to do some work to better understand it and its new capabilities with Rerun. I want to really understand how Colmap, and in particular, pycolmap, works outside of just calling it via the CLI. So my goal is to use the low-level pycolmap API to log every part of the pipeline. The explicit goal is to have an alternative to the SQLite database that I can utilize. Instead of SQLite, I want to try logging everything directly to rerun and use RRD. This means I can have deep inspectability and still save the features/matches/2D view geometry, but be able to view it directly in rerun. I think this is one of the superpowers that rerun provides; data and visualizations are deeply integrated. As I'm often working with sequential data (videos), I'm going to specifically focus on four things: 1. Monocular Video Simple: Calls high-level APIs such as pycolmap.extract_features, pycolmap.match_sequential, pycolmap.incremental_mapping. These are basically identical to the CLI options and provide a good baseline. 2. Monocular Video Streamed: Take the above high-level APIs and break them down to their iterator version, logging each component in a streamed manner. This way, I can stream the intermediate features to rerun while the extraction/matching/mapping is happening. 3. Rig with unknown calibration: <- WHAT THE VIDEO SHOWS This is probably the most interesting version and the first one I've been working on. It allows one to set a rig between known sensors, such as in VR/AR devices, leading to much better reconstructions with multiple cameras. This is the case where we don't know the calibration a priori, so we have to run a reconstruction twice: once as a normal Colmap reconstruction with no rig constraints, use this to generate the constraints, and then do it again with the newly found rig. 4. Rig with known calibration: This is the RoboCap example, where we have a pre-calibrated set of sensors, so we don't need to run the two reconstructions and also gain better matching between cameras, both spatially and temporally. Again, this leads to a much better reconstruction! Along with all this, GLOMAP has become a first-class global mapper, making it super easy to use directly within pycolmap! I'm excited to do more with this and compare it to things like pycuvslam, vipe, and other alternatives.

Pablo Vela

30,070 views • 2 months ago

The U.S. Special Presidential Envoy for Ukraine, General Keith Kellogg Keith Kellogg, is in Kyiv today. I am grateful for a constructive meeting. We discussed various vectors of cooperation – how to achieve real peace and guarantee Ukraine’s security. These include projects within the PURL initiative for financing production and procurement of Patriot systems, strong bilateral agreements on co-production of drones and weapons that we have proposed to America. We count on a positive response from the United States. We had a substantive discussion on stepping up pressure on the Russians and what we can do together with partners in tariff and sanctions policy to enable a meeting at the leaders’ level at the earliest and bring this war to an end. A trilateral leaders’ format is undoubtedly the most effective. We also discussed the return of abducted Ukrainian children, international cooperation on this track, and the conditions in which our children are being held. I expressed condolences to the American people over the horrific murder of Charlie Kirk and thanked President Trump for his condolences and response to the brutal murder of Ukrainian citizen Iryna Zarutska in North Carolina. It is important that justice prevail every time violence seeks to take hold. We are also preparing for the 80th session of the UN General Assembly in New York. We discussed planned events, coordination between Ukraine and the U.S., and work within the Coalition of the Willing. We are working on potential meetings and various formats.

Volodymyr Zelenskyy / Володимир Зеленський

226,833 views • 9 months ago