LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
paper page: github:
Recent advancements in text-to-image generation with diffusion models have yielded remarkable results, synthesizing highly realistic and diverse images. However, these models still encounter difficulties when generating images from prompts that demand spatial or...
83,657 views • 2 years ago • via X (Twitter)
6 Comments

Boyi Li • 2 years ago
Thanks @_akhaliq for sharing our work!

zorr0 (ττ) • 2 years ago
@replytensor

haareblond • 2 years ago
cool but still feels hacky

Takomo AI • 2 years ago
That's great progress!

Cavit Erginsoy • 2 years ago
@yuliangxiu I saw this about a month ago and played around with it. Is this the same or a parallel dev? Wish someone built an extension for A1111

VIJAY KUMAR REDDY BOMMIREDDY • 2 years ago
Impressive work! Expanding the text-to-image domain with diffusion models showcases great potential. Looking forward to exploring the paper and GitHub repository. Keep up the great work! 👍
