Diffusion models generate high-quality images but require hundreds of forward passes. MIT CSAIL and Adobe Research introduce Distribution Matching Distillation (DMD), a distillation approach that converts costly multi-step diffusion models into fast one-step generators. A thread 🧵
34,347 views • 2 years ago •via X (Twitter)

DMD trains a one-step generator that maps random noise to realistic images. Its training objective has two key components. First, a regression loss anchors the mapping, ensuring a coarse organization of the image space and stabilizing training.
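
To make the regression anchor concrete, here is a minimal toy sketch, not the authors' implementation. It assumes the loss is computed on pairs of (noise, multi-step teacher output) and pulls the one-step student toward the teacher's result for the same noise. Everything named here is hypothetical: `teacher_multistep` is a stand-in for a slow sampler, the student is a 1-D linear map, and plain squared error is swapped in for whatever image distance the paper actually uses.

```python
import random

def teacher_multistep(z, steps=50):
    """Hypothetical stand-in for a slow multi-step sampler.
    It iteratively refines the noise z toward the value 3*z + 1."""
    x = z
    target = 3.0 * z + 1.0
    for _ in range(steps):
        x += 0.2 * (target - x)
    return x

def regression_loss(w, b, pairs):
    """Mean squared error between the one-step student w*z + b
    and the teacher output y, over all (z, y) pairs."""
    return sum((w * z + b - y) ** 2 for z, y in pairs) / len(pairs)

def fit_student(pairs, lr=0.05, iters=500):
    """Plain gradient descent on the regression loss."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(iters):
        grad_w = sum(2.0 * (w * z + b - y) * z for z, y in pairs) / n
        grad_b = sum(2.0 * (w * z + b - y) for z, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = random.Random(0)
noises = [rng.gauss(0.0, 1.0) for _ in range(256)]
# Run the expensive teacher once per noise sample and reuse the pairs.
pairs = [(z, teacher_multistep(z)) for z in noises]
w, b = fit_student(pairs)
print(f"student: x = {w:.3f} * z + {b:.3f}, loss = {regression_loss(w, b, pairs):.2e}")
```

After fitting, the student reproduces in one evaluation what the toy teacher needs 50 refinement steps to compute, which is the role the regression term plays as an anchor.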

Second, a distribution matching loss ensures that the probability of the student model generating a given image matches how often that image actually occurs in the real world.

The gradient of this loss is formulated as the difference between the outputs of two diffusion models, one trained on real samples and the other on fake (generated) samples.
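
A toy 1-D sketch of that gradient, under loud assumptions: the "two diffusion models" are replaced by closed-form Gaussian scores (a real score for the data, a fake score for the generator's current output distribution), the generator is a hypothetical linear map `a * z + b`, and a single fixed noise level `sigma` is used. This illustrates the idea of following the score difference; it is not the paper's training code.

```python
import random

def gaussian_score(x, mean, var):
    """Score (gradient of the log density) of N(mean, var) at x."""
    return -(x - mean) / var

def train(mu_real=2.0, std_real=0.5, sigma=0.5, lr=0.1,
          iters=500, batch=128, seed=0):
    rng = random.Random(seed)
    a, b = 1.5, 0.0  # generator G(z) = a*z + b, so fakes follow N(b, a^2)
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            x = a * z + b                          # one-step generation
            x_t = x + sigma * rng.gauss(0.0, 1.0)  # perturb with noise
            # Stand-ins for the two diffusion models: analytic scores of
            # the noised real and noised fake distributions.
            s_real = gaussian_score(x_t, mu_real, std_real ** 2 + sigma ** 2)
            s_fake = gaussian_score(x_t, b, a ** 2 + sigma ** 2)
            direction = s_fake - s_real  # difference of the two outputs
            grad_a += direction * z      # chain rule: dx/da = z
            grad_b += direction          # chain rule: dx/db = 1
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return a, b

a, b = train()
print(f"generator: x = {a:.3f} * z + {b:.3f} (real data: mean 2.0, std 0.5)")
```

When the fake distribution matches the real one, the two scores agree everywhere and the update vanishes, so the generator settles on the real mean and spread rather than collapsing to a single mode.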

DMD achieves a strong 11.49 FID on zero-shot COCO-30K, comparable to Stable Diffusion v1.5 while being 30X faster. Compared to previous approaches, it notably balances image quality with sample diversity.

DMD paves the way for real-time visual generation. The same approach could improve diffusion-based generative models across various fields, from design to scientific discovery and beyond, by significantly increasing their speed and effectiveness.

Paper: Authors: @TianweiY, @m_gharbi, @rzhang88, @elishechtman, @fredodurand, Bill Freeman, and Taesung Park. Project page: MIT News:

@AdobeResearch will you release the code / model for this?

@AdobeResearch Could this approach of distribution matching loss be applied to other generative AI tasks besides image generation? For example, text generation or music composition?

@rzhang88 @AdobeResearch Good for you, my friend. We are now trying to use your colorization model (from 4 years ago) for sneakers, lol. Thank you for everything.
