
✨ CVPR 2025 highlight: A Distractor-Aware Memory for Visual Object Tracking with SAM2. The authors propose a new distractor-aware memory (DAM) model for SAM2 and an introspection-based update strategy that jointly address segmentation accuracy and tracking robustness 🏡 (1/n)🧵👇

32,669 views • 11 months ago • via X (Twitter)

7 Comments

GeekyRakshit (e/mad) • 11 months ago

The authors redesign SAM2’s memory into two complementary parts. Recent-Appearance Memory (RAM) is a small FIFO buffer that stores the most recent frames (time-stamped) to keep segmentation accurate as the target’s appearance changes. Distractor-Resolving Memory (DRM) is a second buffer that keeps anchor frames that help disambiguate the target from hard external or internal distractors; these slots are not time-stamped, so their influence does not decay. (2/n)🧵👇
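A minimal sketch of the two-buffer split described above, not the authors' implementation. The class name, the method names, the buffer sizes and the DRM eviction policy (oldest anchor dropped first) are all assumptions made for illustration; the thread only states that RAM is a small time-stamped FIFO and that DRM slots carry no time stamp.

```python
from collections import deque
from typing import Any, List, Optional, Tuple

# One memory slot: (features, frame_idx). frame_idx is None for slots
# that are not time-stamped.
Slot = Tuple[Any, Optional[int]]


class TwoPartMemory:
    """Hypothetical container mirroring the RAM/DRM split."""

    def __init__(self, ram_size: int = 3, drm_size: int = 3) -> None:
        # Sizes are illustrative assumptions, not values from the paper.
        # RAM: FIFO of the most recent frames; a full deque drops the oldest.
        self.ram: deque = deque(maxlen=ram_size)
        # DRM: anchor frames. Evicting the oldest anchor when full is an
        # assumption; the thread does not specify the policy.
        self.drm: deque = deque(maxlen=drm_size)

    def add_recent(self, features: Any, frame_idx: int) -> None:
        """Store a recent frame together with its time stamp."""
        self.ram.append((features, frame_idx))

    def add_anchor(self, features: Any) -> None:
        """Store an anchor frame without a time stamp, so it does not decay."""
        self.drm.append((features, None))

    def slots(self) -> List[Slot]:
        """Return every slot the tracker would attend to, anchors first."""
        return list(self.drm) + list(self.ram)


if __name__ == "__main__":
    mem = TwoPartMemory(ram_size=2, drm_size=2)
    for t in range(4):
        mem.add_recent(f"frame{t}", t)   # only the last two frames survive
    mem.add_anchor("anchor0")            # kept regardless of frame age
    print(mem.slots())
```

When to call `add_anchor` is left to the caller here; in the thread's description that decision belongs to the introspection-based update strategy, which this sketch does not model.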

GeekyRakshit (e/mad) • 11 months ago

Plugging DAM and the new update rules into the off-the-shelf SAM 2.1 backbone, without any retraining, yields large practical gains and sets a new SoTA (3/n)🧵👇

GeekyRakshit (e/mad) • 11 months ago

The authors also create a distractor-distilled tracking dataset, DiDi, to address the low distractor presence in current visual object tracking benchmarks 📀 (4/n)🧵👇

GeekyRakshit (e/mad) • 11 months ago

Overall, the paper’s novelty lies in recognising that “one size fits all” memory is insufficient for distractor-heavy tracking and providing a simple, training-free remedy that lifts SAM-based tracking to state-of-the-art levels (5/5)🧵🏁

Ankit • 11 months ago

Good stuff brother

Rahul • 11 months ago

very cool

7racker • 11 months ago

I feel like this example is very edge-casey
