Introducing Modality Forcing, a recipe for post-training T2I models for SOTA RGB-Depth generation! Text-to-image (T2I) models learn rich representations of the spatial world. How do we build on this prior for high-quality depth generation? 🧵 [1/6]
61,373 views • 6 days ago • via X (Twitter)

