
📢📢📢 AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers. TL;DR: for 3D camera control in generative video, it really helps to know *which* part of your model you should mess with. Internship by Sherwin Bahmani at Snap.

23,040 views • 1 year ago • via X (Twitter)

5 Comments

Andrea Tagliasacchi 🇨🇦 • 1 year ago

TL;DR (expanded): 1) "when" in the diffusion process you condition for camera matters (i.e. noise scheduler) 2) "how" in the diffusion process you condition for camera matters (i.e. architecture) 3) "what data" you give to your diffusion model to condition camera matters

Andrea Tagliasacchi 🇨🇦 • 1 year ago

Why, you ask? 1) camera motion is low-frequency... early denoising iterations deal with low-frequency content 2) early DiT blocks are enough to fine-tune for camera control... more and you lose quality 3) model needs to know what a static view of the dynamic world looks like
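
Points 1) and 2) above can be pictured as a simple gate that switches camera conditioning on only for early denoising iterations and early DiT blocks. This is a minimal sketch, not the AC3D implementation: the function names, the convention that step 0 is the noisiest iteration, and the cutoff fractions are all hypothetical placeholders chosen for illustration.

```python
def camera_conditioning_active(step, num_steps, block_idx, num_blocks,
                               step_fraction=0.4, block_fraction=0.3):
    """Decide whether to inject camera conditioning at this point.

    Assumptions (not from the thread): steps are 0-indexed with step 0
    being the first, noisiest denoising iteration; blocks are 0-indexed
    from the input side; both fractions are illustrative placeholders.
    """
    # Early iterations handle low-frequency content such as camera motion.
    step_cutoff = int(num_steps * step_fraction)
    # Only the early transformer blocks receive the camera signal.
    block_cutoff = int(num_blocks * block_fraction)
    return step < step_cutoff and block_idx < block_cutoff


def conditioning_schedule(num_steps, num_blocks, **cutoffs):
    """Build a steps-by-blocks grid of booleans from the gate above."""
    return [[camera_conditioning_active(s, num_steps, b, num_blocks, **cutoffs)
             for b in range(num_blocks)]
            for s in range(num_steps)]


if __name__ == "__main__":
    # Print one row per denoising step: "C" = conditioned, "." = untouched.
    for s, row in enumerate(conditioning_schedule(num_steps=10, num_blocks=8)):
        print(s, "".join("C" if on else "." for on in row))
```

In a real model the gate would wrap whatever layer injects the camera embedding. The grid only makes visible how small the conditioned region is relative to the full sampling process.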

Andrea Tagliasacchi 🇨🇦 • 1 year ago

A shout-out to the collaborators @isskoro @guocheng_qian A. Siarohin @willimenapace @SergeyTulyakov at Snap and @DaveLindell at UofT.

Samarth Sinha • 1 year ago

Congrats @sherwinbahmani!!

Abdullah Hamdi • 1 year ago

@sherwinbahmani Congrats to the team! Amazing work
