Загрузка видео...

Не удалось загрузить видео

На главную

Meta presents Video Editing via Factorized Diffusion Distillation We introduce Emu Video Edit (EVE), a model that establishes a new state-of-the art in video editing without relying on any supervised video editing data. To develop EVE we separately train an image editing

115,597 просмотров • 2 лет назад •via X (Twitter)

Комментарии: 7

Фото профиля AK
AK2 лет назад

adapter and a video generation adapter, and attach both to the same text-to-image model. Then, to align the adapters towards video editing we introduce a new unsupervised distillation procedure, Factorized Diffusion Distillation. This procedure distills knowledge from one or

Фото профиля AK
AK2 лет назад

more teachers simultaneously, without any supervised data. We utilize this procedure to teach EVE to edit videos by jointly distilling knowledge to (i) precisely edit each individual frame from the image editing adapter, and (ii) ensure temporal consistency among the

Фото профиля AK
AK2 лет назад

edited frames using the video generation adapter. Finally, to demonstrate the potential of our approach in unlocking other capabilities, we align additional combinations of adapters

Фото профиля AK
AK2 лет назад

paper page:

Фото профиля Uri Gil
Uri Gil2 лет назад

that is not what the term "video editing" usually refers to. It should be called video manipulation or something

Фото профиля Jing Gu
Jing Gu2 лет назад

Using two adapters to function for editing and video part. Good idea 👍

Фото профиля Simulacra Latens
Simulacra Latens2 лет назад

What is the edit? All I see is image swapping/IPAdapater style transfer which we already have?

Похожие видео

InstantDrag Improving Interactivity in Drag-based Image Editing discuss: Drag-based image editing has recently gained popularity for its interactivity and precision. However, despite the ability of text-to-image models to generate samples within a second, drag editing still lags behind due to the challenge of accurately reflecting user interaction while maintaining image content. Some existing approaches rely on computationally intensive per-image optimization or intricate guidance-based methods, requiring additional inputs such as masks for movable regions and text prompts, thereby compromising the interactivity of the editing process. We introduce InstantDrag, an optimization-free pipeline that enhances interactivity and speed, requiring only an image and a drag instruction as input. InstantDrag consists of two carefully designed networks: a drag-conditioned optical flow generator (FlowGen) and an optical flow-conditioned diffusion model (FlowDiffusion). InstantDrag learns motion dynamics for drag-based image editing in real-world video datasets by decomposing the task into motion generation and motion-conditioned image generation. We demonstrate InstantDrag's capability to perform fast, photo-realistic edits without masks or text prompts through experiments on facial video datasets and general scenes. These results highlight the efficiency of our approach in handling drag-based image editing, making it a promising solution for interactive, real-time applications.

AK

71,201 просмотров • 1 год назад