
✨ CVPR 2025 highlight: "A Distractor-Aware Memory for Visual Object Tracking with SAM2". The authors propose a new distractor-aware memory (DAM) model for SAM2 and an introspection-based update strategy that jointly address segmentation accuracy and tracking robustness 🏡 (1/n)🧵👇

32,669 views • 11 months ago • via X (Twitter)

7 comments

GeekyRakshit (e/mad) • 11 months ago

The authors redesign SAM2's memory into two complementary parts. Recent-Appearance Memory (RAM): a small FIFO buffer that stores the most recent frames (time-stamped) to keep segmentation accurate as the target's appearance changes. Distractor-Resolving Memory (DRM): a second buffer that keeps anchor frames that help disambiguate the target from hard external or internal distractors; these slots are not time-stamped, so their influence does not decay. (2/n)🧵👇
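A minimal Python sketch of the two-buffer idea described above, for illustration only. The class name, the capacities, the `is_anchor` flag, and the oldest-first eviction of DRM slots are all assumptions made here; the thread does not give the paper's actual update rule or buffer sizes, and this is not SAM2's API.

```python
from collections import deque


class DistractorAwareMemorySketch:
    """Toy two-part memory. All names and sizes are hypothetical."""

    def __init__(self, ram_size=3, drm_size=3):
        # RAM: FIFO of (frame_index, features); the oldest entry is dropped first.
        self.ram = deque(maxlen=ram_size)
        # DRM: anchor features stored without a timestamp.
        # Oldest-first eviction here is an assumption, not from the thread.
        self.drm = deque(maxlen=drm_size)

    def update(self, frame_idx, features, is_anchor=False):
        """Store a frame. `is_anchor` stands in for the introspection-based
        decision, which is not specified in the thread."""
        self.ram.append((frame_idx, features))
        if is_anchor:
            self.drm.append(features)

    def read(self):
        """Return (timestamp, features) pairs; DRM entries get timestamp None."""
        return list(self.ram) + [(None, f) for f in self.drm]


mem = DistractorAwareMemorySketch(ram_size=2, drm_size=2)
mem.update(0, "f0", is_anchor=True)  # frame 0 is kept as an anchor
mem.update(1, "f1")
mem.update(2, "f2")                  # frame 0 now falls out of RAM
print(mem.read())
```

In this sketch frame 0 leaves the recent-appearance buffer once two newer frames arrive, but its features remain available through the untimed anchor buffer, which mirrors the "influence does not decay" point in the post.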

GeekyRakshit (e/mad) • 11 months ago

Plugging DAM and the new update rules into the off-the-shelf SAM 2.1 backbone, without any retraining, yields large practical gains and sets a new SoTA (3/n)🧵👇

GeekyRakshit (e/mad) • 11 months ago

The authors also create DiDi, a distractor-distilled tracking dataset, to address the low presence of distractors in current visual object tracking benchmarks 📀 (4/n)🧵👇

GeekyRakshit (e/mad) • 11 months ago

Overall, the paper's novelty lies in recognising that a "one size fits all" memory is insufficient for distractor-heavy tracking, and in providing a simple, training-free remedy that lifts SAM-based tracking to state-of-the-art levels (5/5)🧵🏁

Ankit • 11 months ago

Good stuff brother

Rahul • 11 months ago

very cool

7racker • 11 months ago

I feel like this example is very edge-casey

Related videos

🚀 The Segment Anything Model (SAM) has been upgraded to SAM2, featuring an efficient image encoder for segmenting images and videos. But does SAM2 outperform SAM1 in medical image and video segmentation? We're thrilled to present our paper "Segment Anything in Medical Images and Videos: Benchmark and Deployment"! We comprehensively benchmark SAM2 across 11 medical image modalities and videos.

📄 Paper:
💻 Code:

**Highlights:**
1. SAM2 doesn't always outperform SAM1 in 2D medical images, but excels in video segmentation, making it more accurate and efficient for 3D images, such as CT and MR scans.
2. MedSAM still outperforms SAM2 on most 2D modalities, but SAM2 surpasses MedSAM for 3D image segmentation in a slice-by-slice approach.
3. Segmentation performance varies with model size; sometimes the smallest model outperforms larger ones.
4. Fine-tuning SAM2 significantly boosts its performance for medical image segmentation.

While SAM2 may struggle with challenging objects that have unclear boundaries or low contrast, it excels in generating good initial segmentation masks for common medical images and videos. However, the official interface doesn't support medical data formats and has limitations on video length. To address this, we've developed a 3D Slicer Plugin and Gradio API for efficient 3D medical image and video segmentation. We invite you to try them out and provide feedback!

🔧 Deployment:
- 3D Slicer Plugin:
- Gradio API:

(Note: Due to GPU limitations, the online API is available for only 12 hours and may be slow. We highly recommend deploying the Gradio API with your own computing resources: )

A big shoutout to Jun Ma (JunMa), who recently joined our UHN AI hub (UHN AI Hub) as Machine Learning Lead, and kudos to all co-authors: Sumin Kim, Feifei Li, Mohammed Baharoon (Mohammed Baharoon), Reza Asakereh, and Hongwei Lyu! This is true teamwork! Looking forward to collaborating with the community to advance 3D medical image and video segmentation foundation models!

University Health Network, U of T Department of Computer Science, Department of Laboratory Medicine & Pathobiology, Temerty Centre for AI in Medicine (T-CAIREM), Vector Institute

#MedTech #AIinHealthcare #DeepLearning #MedicalImaging #SAM2 #MedSAM #AIResearch

Bo Wang

178,419 views • 1 year ago