Introducing Modality Forcing, a recipe for post-training T2I models for SOTA RGB-Depth generation! Text-to-image (T2I) models learn rich representations of the spatial world. How do we build on this prior for high-quality depth generation? 🧵 [1/6]
61,990 views • 7 days ago • via X (Twitter)

