CVPR 2025 papers pt. 2 - SAMWISE

SAMWISE adds language understanding and temporal reasoning to SAM2; you can segment and track objects in videos just by describing them.

more papers: ↓

SkalskiP

34,807 subscribers

20,528 views • 1 year ago •via X (Twitter)

9 Comments

SkalskiP @ CVPR2025 🇺🇸1 year ago

- paper: - code: - video:

SkalskiP @ CVPR2025 🇺🇸1 year ago

SAM2 supports visual prompts like points and boxes but has no native support for text prompts. I have often shown how combining SAM2 with VLMs enables language-guided image segmentation. SAMWISE allows direct text-driven video object segmentation.

SkalskiP @ CVPR2025 🇺🇸1 year ago

SAM2 can make mistakes that, without human correction, persist in subsequent frames. SAMWISE can auto-correct its own mistakes.

SkalskiP @ CVPR2025 🇺🇸1 year ago

SAMWISE uses a frozen Segment Anything 2 (SAM2) model and a frozen text encoder. It adds a special module called the Cross-Modal Temporal (CMT) Adapter, which helps the model combine information from both the video and the text and follow changes over time.
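
To make the "combine video and text" idea concrete, here is a toy sketch of residual cross-attention, where each visual token attends over the text tokens and the result is added back to the untouched visual feature. This is only an illustration of the general adapter pattern, not the SAMWISE CMT code: `cross_modal_adapter`, its `scale` parameter, and the tiny 2-D vectors are all hypothetical, there are no learned weights, and the temporal part is left out entirely.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_modal_adapter(visual_tokens, text_tokens, scale=1.0):
    """Toy residual cross-attention (hypothetical, no learned weights).

    Each visual token attends over all text tokens; the attended text
    vector is added to the original visual token, so the backbone
    feature is kept and only nudged toward the prompt.
    """
    dim = len(text_tokens[0])
    fused = []
    for v in visual_tokens:
        weights = softmax([dot(v, t) / math.sqrt(dim) for t in text_tokens])
        attended = [
            sum(w * t[i] for w, t in zip(weights, text_tokens))
            for i in range(dim)
        ]
        # Residual connection: with scale=0 the frozen feature passes through unchanged.
        fused.append([vi + scale * ai for vi, ai in zip(v, attended)])
    return fused


visual = [[1.0, 0.0], [0.0, 1.0]]   # two made-up visual tokens
text = [[1.0, 0.0]]                 # one made-up text token
print(cross_modal_adapter(visual, text))
```

Setting `scale=0.0` returns the visual tokens unchanged, which mirrors why an adapter can sit next to a frozen backbone without disturbing it at the start of training.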

SkalskiP @ CVPR2025 🇺🇸1 year ago

The Conditional Memory Encoder (CME) helps the model notice when a new object fits your prompt better, so SAMWISE can automatically switch tracking, even if the correct object appears later or is hidden for a while.
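
The switching behaviour can be pictured with a very rough decision rule. The sketch below is an assumption-heavy toy, not how the CME is implemented: it pretends each frame already comes with a per-object "how well does this match the prompt" score, and the function name `track_with_switching`, the `margin` threshold, and the object ids are all made up for illustration.

```python
def track_with_switching(frames, margin=0.1):
    """Toy tracker that switches target when a better match shows up.

    frames: list of dicts mapping object id -> prompt-match score in [0, 1].
            An object missing from a frame is hidden or not yet visible.
    margin: hypothetical threshold; a candidate must beat the tracked
            object's last known score by this much to take over.
    """
    current, current_score = None, 0.0
    history = []
    for scores in frames:
        if current in scores:
            # Refresh the tracked object's score while it is visible.
            current_score = scores[current]
        if scores:
            best = max(scores, key=scores.get)
            better = best != current and scores[best] > current_score + margin
            if current is None or better:
                current, current_score = best, scores[best]
        # When nothing is visible, keep the previous target (occlusion).
        history.append(current)
    return history


frames = [
    {"dog_a": 0.6},                 # only a weak match is visible
    {"dog_a": 0.55, "dog_b": 0.9},  # a better match appears later
    {},                             # everything is hidden for a frame
    {"dog_b": 0.85},
]
print(track_with_switching(frames))
```

The margin keeps the tracker from flip-flopping between two objects with nearly identical scores; the real model learns when to switch rather than using a fixed threshold.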

SkalskiP @ CVPR2025 🇺🇸1 year ago

full poster explaining text understanding, temporal modeling, tracking bias, and much more

Nigam Arora1 year ago

In 2025, how much more money can you make in the stock market by following the most accurate analysis?

Team Reagent1 year ago

Can it do this in real-time?

Team Reagent1 year ago

Oh we are DEFINITELY taking a look at this! Wow!!
