Meta presents Video Editing via Factorized Diffusion Distillation
115,594 views • 2 years ago • via X (Twitter)
7 Comments

We introduce Emu Video Edit (EVE), a model that establishes a new state-of-the-art in video editing without relying on any supervised video editing data. To develop EVE, we separately train an image editing adapter and a video generation adapter, and attach both to the same text-to-image model. Then, to align the adapters towards video editing, we introduce a new unsupervised distillation procedure, Factorized Diffusion Distillation. This procedure distills knowledge from one or more teachers simultaneously, without any supervised data. We utilize this procedure to teach EVE to edit videos by jointly distilling knowledge to (i) precisely edit each individual frame from the image editing adapter, and (ii) ensure temporal consistency among the edited frames using the video generation adapter. Finally, to demonstrate the potential of our approach in unlocking other capabilities, we align additional combinations of adapters
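To make the "jointly distilling from two teachers" idea concrete, here is a toy sketch in plain Python. It is not the paper's implementation: the post does not say which loss functions are used, so mean squared error stands in as a placeholder, and every name here (`mse`, `joint_distillation_loss`, `w_edit`, `w_video`) is hypothetical. Clips are modeled as lists of tiny flattened frames.

```python
from typing import List

Frame = List[float]  # one flattened frame; a toy stand-in for real pixels or latents


def mse(a: List[Frame], b: List[Frame]) -> float:
    """Mean squared error between two clips of identical shape."""
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for x_a, x_b in zip(frame_a, frame_b):
            total += (x_a - x_b) ** 2
            count += 1
    return total / count


def joint_distillation_loss(
    student: List[Frame],
    edit_teacher: List[Frame],
    video_teacher: List[Frame],
    w_edit: float = 1.0,   # hypothetical weight for the image-editing teacher
    w_video: float = 1.0,  # hypothetical weight for the video-generation teacher
) -> float:
    """Sum of two teacher-matching terms, so one student learns from both at once.

    Placeholder assumption: each teacher signal is matched with plain MSE.
    """
    return w_edit * mse(student, edit_teacher) + w_video * mse(student, video_teacher)


# Two-frame toy clip: each teacher disagrees with the student on one value.
student = [[0.0, 0.0], [1.0, 1.0]]
edit_teacher = [[0.0, 2.0], [1.0, 1.0]]   # per-frame editing signal
video_teacher = [[0.0, 0.0], [1.0, 3.0]]  # temporal-consistency signal
loss = joint_distillation_loss(student, edit_teacher, video_teacher)
print(loss)  # 2.0
```

The point of the sketch is only the structure: a single student output is scored against both teachers in the same step, rather than being trained on paired before/after video edits.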

paper page:

That is not what the term "video editing" usually refers to. It should be called video manipulation or something.

Using two adapters, one for the editing part and one for the video part. Good idea 👍

What is the edit? All I see is image swapping/IP-Adapter-style transfer, which we already have?
