Meta presents Video Editing via Factorized Diffusion Distillation

We introduce Emu Video Edit (EVE), a model that establishes a new state of the art in video editing without relying on any supervised video editing data. To develop EVE we separately train an image editing

AK

504,352 subscribers

115,597 views • 2 years ago • via X (Twitter)

7 comments

AK • 2 years ago

adapter and a video generation adapter, and attach both to the same text-to-image model. Then, to align the adapters towards video editing, we introduce a new unsupervised distillation procedure, Factorized Diffusion Distillation. This procedure distills knowledge from one or

AK • 2 years ago

more teachers simultaneously, without any supervised data. We utilize this procedure to teach EVE to edit videos by jointly distilling knowledge to (i) precisely edit each individual frame from the image editing adapter, and (ii) ensure temporal consistency among the

AK • 2 years ago

edited frames using the video generation adapter. Finally, to demonstrate the potential of our approach in unlocking other capabilities, we align additional combinations of adapters

AK • 2 years ago

paper page:

Uri Gil • 2 years ago

that is not what the term "video editing" usually refers to. It should be called video manipulation or something

Jing Gu • 2 years ago

Using two adapters to function for editing and video part. Good idea 👍

Simulacra Latens • 2 years ago

What is the edit? All I see is image swapping/IP-Adapter style transfer, which we already have?

Related videos

Today we’re sharing two new advances in our generative AI research: Emu Video & Emu Edit. Details ➡️ These new models deliver exciting results in high quality, diffusion-based text-to-video generation & controlled image editing w/ text instructions. 🧵

AI at Meta

798,183 views • 2 years ago

🕹️We are excited to introduce "ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation" ChronoEdit reframes image editing as a video generation task to encourage temporal consistency. It leverages a temporal reasoning stage that denoises with “video reasoning tokens” to "reason" on physically plausible edits. See the attached video for results. Project Page: Arxiv: Code and model are coming.

Huan Ling

36,835 views • 8 months ago

TurboEdit Instant text-based image editing discuss: We address the challenges of precise image inversion and disentangled image editing in the context of few-step diffusion models. We introduce an encoder based iterative inversion technique. The inversion network is conditioned on the input image and the reconstructed image from the previous step, allowing for correction of the next reconstruction towards the input image. We demonstrate that disentangled controls can be easily achieved in the few-step diffusion model by conditioning on an (automatically generated) detailed text prompt. To manipulate the inverted image, we freeze the noise maps and modify one attribute in the text prompt (either manually or via instruction based editing driven by an LLM), resulting in the generation of a new image similar to the input image with only one attribute changed. It can further control the editing strength and accept instructive text prompt. Our approach facilitates realistic text-guided image edits in real-time, requiring only 8 number of functional evaluations (NFEs) in inversion (one-time cost) and 4 NFEs per edit. Our method is not only fast, but also significantly outperforms state-of-the-art multi-step diffusion editing techniques.

AK

16,062 views • 1 year ago

We do a little video editing

Lee (Greater)

13,243 views • 4 months ago

video editing is stuck in 2005, time for something new introducing diffusion the first infinite canvas for video and motion graphics like figma, but for editing

konstantinpaulus

128,617 views • 1 month ago

Exciting milestones in our generative AI research: Emu Video, which lets you create high quality videos from a text prompt, and Emu Edit, which enables detailed image editing based on your instructions. These new models are built on Emu, our foundation model for image generation and technology from them will underpin new creative features across our apps next year. Try it out: Emu Video: Emu Edit:

Boz

110,720 views • 2 years ago

is this the end of video editing 🤯 i found an AI-powered video editing tool that does most of my video creation with ease.... it's cursor for video editing

Fakhr

29,070 views • 8 months ago

Grok Imagine API just released A world-class video generation + video editing model Text-to-Video: Turn simple prompts into rich video clips with audio Image Generation + Editing: Bring ideas to life with visuals from scratch Video Editing Tools: Restyle scenes, add/remove props, control motion Best-in-Class Quality + Low Latency: Designed to deliver fast, cost-efficient results API pricing: Image input: $0.002 Video input : $0.01 Video output : $0.05

X Freeze

15,078 views • 4 months ago

we built Cursor for video editing

Timothy Wang

911,617 views • 1 year ago

Here’s what we are working on at Adobe Layered image editing 🤯 the future of AI image editing

Kris Kashtanova

59,819 views • 7 months ago

InstantDrag Improving Interactivity in Drag-based Image Editing discuss: Drag-based image editing has recently gained popularity for its interactivity and precision. However, despite the ability of text-to-image models to generate samples within a second, drag editing still lags behind due to the challenge of accurately reflecting user interaction while maintaining image content. Some existing approaches rely on computationally intensive per-image optimization or intricate guidance-based methods, requiring additional inputs such as masks for movable regions and text prompts, thereby compromising the interactivity of the editing process. We introduce InstantDrag, an optimization-free pipeline that enhances interactivity and speed, requiring only an image and a drag instruction as input. InstantDrag consists of two carefully designed networks: a drag-conditioned optical flow generator (FlowGen) and an optical flow-conditioned diffusion model (FlowDiffusion). InstantDrag learns motion dynamics for drag-based image editing in real-world video datasets by decomposing the task into motion generation and motion-conditioned image generation. We demonstrate InstantDrag's capability to perform fast, photo-realistic edits without masks or text prompts through experiments on facial video datasets and general scenes. These results highlight the efficiency of our approach in handling drag-based image editing, making it a promising solution for interactive, real-time applications.

AK

71,201 views • 1 year ago

Google announces Dreamix: a model that generates videos when given: - video + prompt (Video editing) - input images + prompt (Subject Driven Generation) - input image + prompt (Image-to-Video

bleedingedge.ai

1,323,744 views • 3 years ago

we just built git for video editing.

Lucas Jin

2,035,854 views • 2 months ago

Introducing Higgsfield Canvas: a state-of-the-art image editing model. Paint products directly onto your image with pixel-perfect control. Say hi to your new go-to for product placement, editing, and layout! 👋🏻 Comment Canvas to get the full guide in the DM.

Higgsfield AI 🧩

2,627,133 views • 1 year ago

Bytedance drops an open-source Gemini Omni!!! Bernini is a new AI video generation + editing framework. > Edit videos with text prompts > Image/video references > Code available

⚡AI Search⚡

43,132 views • 13 days ago

A video editing tool, made by a successful YouTuber. We love to see it. 👏

Product Hunt 😸

12,113 views • 2 years ago

Introducing ChatGPT Images 2.0 A state-of-the-art image model that can take on complex visual tasks and produce precise, immediately usable visuals, with sharper editing, richer layouts, and thinking-level intelligence. Video made with ChatGPT Images

OpenAI

12,862,019 views • 1 month ago

The editing of this video 👌

Interesting AF

725,367 views • 10 months ago

We built Cursor for video editing (and it's live)

Sabba Keynejad

160,946 views • 1 year ago