Introducing ConceptAttention, an approach to interpreting diffusion transformer models! Write a prompt, choose some concepts, generate an image, and get high-quality heatmaps of text concepts. Our method outperforms existing methods like cross attention. Link to demo 👇

36,631 views • 1 year ago • via X (Twitter)

11 comments

Alec Helbling • 1 year ago

We have a live interactive demo hosted on Hugging Face Spaces:

Alec Helbling • 1 year ago

Check out the code here:

Alec Helbling • 1 year ago

We repurpose the parameters of multi-modal DiT models (e.g. Flux) without training to create rich contextualized embeddings of text concepts. This allows us to create high-quality saliency maps. We wrote a paper about our method:
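Editor's sketch, not the authors' code: the comment above says concept embeddings are turned into saliency maps but does not say how. One common way to do that, assumed here purely for illustration, is to take the dot product of each image-patch embedding with each concept embedding and apply a softmax across concepts. The function name `concept_heatmaps`, its arguments, and the toy numbers are all hypothetical.

```python
import math


def concept_heatmaps(patch_emb, concept_emb, grid_hw):
    """Return one (height x width) heatmap per concept.

    patch_emb:   list of per-patch embedding vectors (hypothetical input)
    concept_emb: list of per-concept embedding vectors (hypothetical input)
    grid_hw:     (height, width) of the patch grid
    """
    h, w = grid_hw
    if len(patch_emb) != h * w:
        raise ValueError("patch count must equal height * width")
    maps = [[[0.0] * w for _ in range(h)] for _ in concept_emb]
    for idx, patch in enumerate(patch_emb):
        # Assumed scoring rule: dot product between this patch and every concept.
        scores = [sum(p * c for p, c in zip(patch, concept)) for concept in concept_emb]
        # Numerically stable softmax over the concept axis.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row, col = divmod(idx, w)
        for k, e in enumerate(exps):
            maps[k][row][col] = e / total
    return maps


# Toy 2x2 patch grid with 2-dimensional embeddings (made-up numbers).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
concepts = [[2.0, 0.0], [0.0, 2.0]]  # stand-ins for e.g. "dog" and "background"
maps = concept_heatmaps(patches, concepts, (2, 2))
```

In this toy setup each patch's scores sum to 1 across concepts, so a patch that matches both concepts equally gets 0.5 in each map.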

Rainmaker • 2 years ago

Join me as I put several machine learning models head-to-head to see which one can beat the market and deliver strong returns. In this free Substack post I share several models that deliver better returns with much lower drawdown than a buy-and-hold approach.

Rishi • 1 year ago

Very nice idea. Explainable AI as a field is not that well explored, so it's nice to see good work in that domain.

Rishi • 1 year ago

Will this work for non-diffusion-based models?

Minh Nhat Nguyen • 1 year ago

i am ... going to see how well this works for video

007 • 1 year ago

Cool

Julien Blanchon 🇺🇦 • 1 year ago

Curious what your intuition is about the entanglement between dog and cat features?

Alec Helbling • 1 year ago

Absolutely. Our observation has been that this approach works very well for discerning distinct features (like dog and background) but struggles with examples like the one you show, where there are two very similar concepts. There is clearly some more complex mechanism that allows the model to differentiate between these concepts, which unfortunately our approach alone is not able to discern. It is worth noting that this limitation is also at play in cross attention mechanisms, and poor object attribute assignment is a known limitation of current diffusion models.

Latent Spacer • 1 year ago

I learned a lot from the paper, great work 👏
