Spent some time this week playing with depth estimation models. Monocular depth estimation has come a long way, but in terms of metric accuracy and geometric consistency, stereo depth still has it beat by a wide margin. DepthPro (mono) vs FoundationStereo (stereo)
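One reason stereo tends to win on metric accuracy: for a rectified stereo pair, depth follows directly from triangulation, Z = f * B / d, with focal length f in pixels, baseline B in meters, and disparity d in pixels. A monocular model has to infer absolute scale from learned priors instead. The sketch below illustrates only that standard relation. The function name and the sample numbers are hypothetical, and it is not taken from DepthPro or FoundationStereo.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Metric depth (meters) of a point seen by a rectified stereo pair.

    Standard pinhole triangulation: Z = f * B / d.
    A non-positive disparity is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 500 px focal length, 10 cm baseline.
    for d in (100.0, 50.0, 10.0):
        print(f"disparity {d:5.1f} px -> depth {disparity_to_depth(d, 500.0, 0.1):.2f} m")
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels costs more depth accuracy at long range than at short range.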

FrostyFridge

1,466 subscribers

27,774 views • 1 year ago • via X (Twitter)

Science & Technology, Education

Related Videos

Discover the right 3D Geometric Foundation Model for your task—whether it’s stereo matching, multi-view depth estimation, video depth, pose estimation, semantic understanding, or novel view synthesis. Explore more insights in our #E3DBench #FoundationModel #3D #GaussianSplatting. Project Webpage:

Zhiwen(Aaron) Fan

29,920 views • 1 year ago

🤔Applying Depth Estimation models directly to videos can result in inconsistency between frames. 💪Well, not anymore. 🔥ChronoDepth is a new approach to video depth estimation that focuses on achieving both accuracy within each frame and consistency across frames. 📜More👇

Gradio

19,986 views • 2 years ago

AnyDepth: lightweight zero-shot monocular depth estimation; surpasses DPT and nicely preserves detail.

Wildminder

18,152 views • 4 months ago

Gave a robot 3D vision with just a regular camera👁️ Full Tutorial: Deployed #Depth Anything V3 on NVIDIA Robotics Jetson AGX Orin. It estimates depth from 2D images in real-time—no special sensors needed. Just need #Monocular depth estimation + #TensorRT optimization + #ROS2 integration. 👉 Learn more about reComputer Robotics J5011: #TheAIHardwarePartner

Seeed Studio

16,966 views • 3 months ago

Depth Any Video with Scalable Synthetic Data. AI physicists and chemists continue to make strides in depth estimation from video. Check out this new paper featuring some impressive examples. See the thread for more details (unfortunately no code yet). Abstract: Video depth estimation has long been hindered by the scarcity of consistent and scalable ground truth data, leading to inconsistent and unreliable results. In this paper, we introduce Depth Any Video, a model that tackles the challenge through two key innovations. First, we develop a scalable synthetic data pipeline, capturing real-time video depth data from diverse game environments, yielding 40,000 video clips of 5-second duration, each with precise depth annotations. Second, we leverage the powerful priors of generative video diffusion models to handle real-world videos effectively, integrating advanced techniques such as rotary position encoding and flow matching to further enhance flexibility and efficiency. Unlike previous models, which are limited to fixed-length video sequences, our approach introduces a novel mixed-duration training strategy that handles videos of varying lengths and performs robustly across different frame rates, even on single frames. At inference, we propose a depth interpolation method that enables our model to infer high-resolution video depth across sequences of up to 150 frames. Our model outperforms all previous generative depth models in terms of spatial accuracy and temporal consistency.

MrNeRF

27,428 views • 1 year ago

Video diffusion models are just overqualified depth estimators! Deterministic single-pass depth estimation based on WanV2.1:
- SOTA 5.5 AbsRel on ScanNet;
- more data-efficient than baselines;
- no temporal flicker + infinite-length estimation w/ zero scale drift.

Wildminder

49,209 views • 2 months ago

Introducing 🛹 RollingDepth 🛹 — a universal monocular depth estimator for arbitrarily long videos! Our paper, “Video Depth without Video Models,” delivers exactly that, setting new standards in temporal consistency. Check out more details in the thread 🧵

Anton Obukhov

49,820 views • 1 year ago

One more turn with this #AR: added underwater transition. All real-time, based on #ML scene depth estimation. Built in Effect House.

Denis Rossiev ᯅ/acc

23,561 views • 2 years ago

Google just revealed an ABSOLUTE depth estimation model 🤯 As opposed to recent depth models (Marigold, PatchFusion), which aim for maximum detail, DMD aims to estimate the ABSOLUTE depth (in meters) within the image. More details below ⬇️⬇️

Alex Carlier

199,515 views • 2 years ago

Left: raw depth from the OAK-D Pro W camera. Right: depth from FoundationStereo.

Yu Xiang

32,598 views • 3 months ago

Monocular pose estimation has gotten really good. Grab any 2D video and transfer the performance to a 3D character.

Bilawal Sidhu

27,074 views • 1 year ago

I added a node to my custom node that converts the depth image to a normal map! This is useful for 2D-style images where Normals Estimation doesn't work well.

toyxyz

117,354 views • 1 year ago

If you are interested in ego-centric data capture, no need to get a headset or smart glasses. Get a $10 head strap mount and put one of these cameras on it. You get RGB, stereo, depth, 3D point clouds, and camera trajectory. Visualization powered by Rerun. Pretty cool!

Yu Xiang

29,882 views • 5 months ago

Nano is a depth-aware atmospheric haze plugin that uses ML depth estimation to add physically accurate fog and light scattering to your footage. Works *best* on log footage with visible light sources: it analyzes scene highlights, then creates airlight (atmospheric scatter) and halation (light bloom) that respond to actual depth in the scene. Pretty clever approach to getting that cinematic haze look without having to pump a fog machine on set. Makes the OG Trapcode Shine look extremely dated (basically 2D light streaks masked by luminance values), and yet is way more controllable than the current crop of generative AI video-to-video tools.

Bilawal Sidhu

275,142 views • 11 months ago

StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation

AK

11,372 views • 5 months ago

This is how a river depth is measured with a laser beamed from an aircraft: bathymetric LiDAR can measure depths from 0.9 m to 40 m with a vertical accuracy of about 15 cm and horizontal accuracy of 2.5 m

Massimo

83,564 views • 2 years ago

This is how a river's depth is measured using an aircraft and a laser beam. The bathymetric LiDAR can measure depths up to 40 m with an accuracy of about 15 cm.

HOW THINGS WORK

553,964 views • 2 years ago

BIG NEWS. Blender 4.1 has realtime multipass of the Z-depth, meaning you can now viewport composite with depth in realtime! Exciting times. And, as it happens, it will also be in Bforartists 4.0.1.

Trinumedia

16,245 views • 2 years ago

This improved depth fade uses a dedicated depth texture map to offset values with RemapValueRange, PixelDepth, and SceneDepth, for a more volumetric result #UnrealEngine5 #VFX #realtimeVFX #gamedev

TUATARA

37,774 views • 1 year ago

Ni-ki’s One in A Billion in Japanese just hits different. His voice has matured so nicely and sounds so full of depth.

rest

10,027 views • 4 months ago