Introducing ConceptAttention, an approach to interpreting diffusion transformer models! Write a prompt, choose some concepts, generate an image, and get high-quality heatmaps of text concepts. Our method outperforms existing methods like cross attention. Link to demo 👇

36,631 views • 1 year ago • via X (Twitter)

Comments: 11

Alec Helbling • 1 year ago

We have a live interactive demo hosted on Hugging Face Spaces:

Alec Helbling • 1 year ago

Check out the code here:

Alec Helbling • 1 year ago

We repurpose the parameters of multi-modal DiT models (e.g., Flux) without training to create rich contextualized embeddings of text concepts. This allows us to create high-quality saliency maps. We wrote a paper about our method:
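
As a rough illustration of how concept embeddings could be turned into saliency maps: the sketch below is not the authors' code and is not taken from the paper linked in the thread. It assumes you already have one embedding per image patch and one per text concept from the same model layer (random arrays stand in for them here), and every name in it (`concept_heatmaps`, `image_tokens`, `concept_tokens`) is hypothetical. It scores each patch against each concept with a dot product and normalizes across concepts.

```python
import numpy as np


def concept_heatmaps(image_tokens, concept_tokens, grid_hw):
    """Turn patch and concept embeddings into one heatmap per concept.

    image_tokens:   (num_patches, dim) array, one row per image patch.
    concept_tokens: (num_concepts, dim) array, one row per text concept.
    grid_hw:        (height, width) of the patch grid.
    All inputs are assumed to come from the same layer of some model;
    how to extract them is outside the scope of this sketch.
    """
    h, w = grid_hw
    assert image_tokens.shape[0] == h * w, "patch count must match the grid"

    # Dot-product similarity of every patch to every concept:
    # shape (num_patches, num_concepts).
    scores = image_tokens @ concept_tokens.T

    # Numerically stable softmax across concepts, so the concepts
    # compete for each patch and the weights per patch sum to 1.
    scores = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)

    # Rearrange to (num_concepts, height, width).
    return probs.T.reshape(concept_tokens.shape[0], h, w)


# Stand-in data: a 4x4 patch grid, 3 concepts, 8-dimensional embeddings.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(16, 8))
concept_tokens = rng.normal(size=(3, 8))
maps = concept_heatmaps(image_tokens, concept_tokens, (4, 4))
```

Because the softmax runs across concepts, near-identical concept embeddings would split the weight almost evenly at every patch, which makes the resulting maps hard to tell apart.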

Rainmaker • 2 years ago

Join me as I put several machine learning models head-to-head to see which one can beat the market and deliver strong returns. In this free Substack post I share several models that deliver better returns with much lower drawdown than a buy-and-hold approach.

Rishi • 1 year ago

Very nice idea. Explainable AI as a field is not that well explored, so it's nice to see good work in that domain.

Rishi • 1 year ago

Will this work for non-diffusion-based models?

Minh Nhat Nguyen • 1 year ago

i am ... going to see how well this works for video

007 • 1 year ago

Cool

Julien Blanchon 🇺🇦 • 1 year ago

Curious what your intuition is about the entanglement between dog and cat features?

Alec Helbling • 1 year ago

Absolutely. Our observations have been that this approach works very well for discerning distinct features (like dog and background) but struggles with examples like the one you show, where there are two very similar concepts. There is clearly some more complex mechanism that allows the model to differentiate between these concepts that, unfortunately, our approach alone is not able to discern. It is worth noting that this limitation is also at play in cross-attention mechanisms, and poor object attribute assignment is a known limitation of current diffusion models.

Latent Spacer • 1 year ago

I learned a lot from the paper, great work 👏
