Introducing ConceptAttention, an approach to interpreting diffusion transformer models! Write a prompt, choose some concepts, generate an image, and get high-quality heatmaps of text concepts. Our method outperforms existing methods like cross attention. Link to demo 👇

36,631 views • 1 year ago • via X (Twitter)

11 Comments

Alec Helbling • 1 year ago

We have a live interactive demo hosted on Hugging Face Spaces:

Alec Helbling • 1 year ago

Check out the code here:

Alec Helbling • 1 year ago

We repurpose the parameters of multi-modal DiT models (e.g., Flux) without training to create rich contextualized embeddings of text concepts. This allows us to create high-quality saliency maps. We wrote a paper about our method:
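To make the step from concept embeddings to saliency maps concrete, here is a toy sketch. It is not the authors' implementation: it assumes that each image patch and each text concept already has an embedding in a shared space, scores every patch against every concept with a dot product, and applies a softmax across concepts. The embeddings are made-up numbers and the function names are hypothetical; in the actual method the embeddings would come from the DiT's own layers.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def concept_heatmaps(patch_embeddings, concept_embeddings, height, width):
    """Return one height x width heatmap per concept (hypothetical helper).

    patch_embeddings: height * width vectors in row-major order.
    concept_embeddings: one vector per text concept.
    """
    assert len(patch_embeddings) == height * width

    # For each patch, compare it with every concept via a dot product,
    # then normalize across concepts so the scores at a patch sum to 1.
    per_patch = []
    for patch in patch_embeddings:
        scores = [
            sum(a * b for a, b in zip(patch, concept))
            for concept in concept_embeddings
        ]
        per_patch.append(softmax(scores))

    # Regroup the per-patch scores into one 2D grid per concept.
    heatmaps = []
    for k in range(len(concept_embeddings)):
        flat = [scores[k] for scores in per_patch]
        heatmaps.append(
            [flat[r * width:(r + 1) * width] for r in range(height)]
        )
    return heatmaps


# Toy 2 x 2 "image": the top row resembles the first concept,
# the bottom row resembles the second. All values are invented.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
concepts = {"dog": [4.0, 0.0], "background": [0.0, 4.0]}

maps = concept_heatmaps(patches, list(concepts.values()), 2, 2)
for name, heatmap in zip(concepts, maps):
    print(name, [[round(v, 2) for v in row] for row in heatmap])
```

In this toy setup the "dog" map is bright on the top row and the "background" map is bright on the bottom row, because the softmax makes the concepts compete for each patch.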

Rainmaker • 2 years ago

Join me as I put several Machine Learning models head-to-head to see which one can beat the market and deliver strong returns. In this free Substack post, I share several models that deliver better returns with much lower drawdown compared to a Buy-and-Hold approach.

Rishi • 1 year ago

Very nice idea. Explainable AI as a field is not that well explored, so it's nice to see good work in that domain.

Rishi • 1 year ago

Will this work for non-diffusion-based models?

Minh Nhat Nguyen • 1 year ago

I am ... going to see how well this works for video

007 • 1 year ago

Cool

Julien Blanchon 🇺🇦 • 1 year ago

Curious what your intuition is about the entanglement between dog and cat features?

Alec Helbling • 1 year ago

Absolutely. Our observation has been that this approach works very well for discerning distinct features (like dog and background) but struggles with examples like the one you show, where there are two very similar concepts. There is clearly some more complex mechanism that allows the model to differentiate between these concepts, which unfortunately our approach alone is not able to discern. It is worth noting that this limitation is also at play in cross-attention mechanisms, and poor object-attribute assignment is a known limitation of current diffusion models.

Latent Spacer • 1 year ago

I learned a lot from the paper, great work 👏
