Emma
Summary:
- Concurrent work uses large-scale text-to-image generation models to edit videos.
- These methods include Video-P2P, Fate-Zero, Tune-A-Video, and Gen-1.
Key terms:
- Large-scale text-to-image generation models: Models that generate images from text prompts and are also used for image editing tasks
- Video-P2P: A method that extends null-text inversion to video clips and adapts a cross-attention control mechanism
- Fate-Zero: A training-free video editing strategy that uses cross-attention maps to compute blending masks
- Tune-A-Video: A method that fine-tunes the image generation model on an input video and uses cross-frame attention for consistent edits
- Gen-1: A large-scale video generation model that uses depth as a structural cue and is trained on a mixed dataset of images and videos
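Example (illustrative sketch only):
The Fate-Zero entry above mentions computing blending masks from cross attention maps. The toy sketch below shows one way such a mask could work. It assumes the mask is a simple threshold on one word's attention map and that blending is a per-pixel mix of source and edited values. The function names, the threshold value, and the data are hypothetical and are not taken from the Fate-Zero code.

```python
def blending_mask(attn_map, threshold=0.3):
    """Binarize an attention map: 1.0 where attention exceeds the threshold.

    The threshold of 0.3 is an assumed value chosen for illustration.
    """
    return [[1.0 if a > threshold else 0.0 for a in row] for row in attn_map]


def blend(source, edited, mask):
    """Keep edited values inside the mask and source values outside it."""
    return [
        [m * e + (1.0 - m) * s for s, e, m in zip(s_row, e_row, m_row)]
        for s_row, e_row, m_row in zip(source, edited, mask)
    ]


# Toy 2x3 cross-attention map for a single edited word (hypothetical values).
attn = [[0.1, 0.8, 0.9], [0.0, 0.2, 0.7]]
source = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
edited = [[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]

mask = blending_mask(attn)
result = blend(source, edited, mask)
```

Here only the positions where the edited word attends strongly take the edited value. The rest keep the source value, which is the intuition behind limiting an edit to the relevant region.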
Tags:
Research
arXiv
Video Editing
Image Diffusion
Concurrent Work
Video-P2P
Fate-Zero
Tune-A-Video
Gen-1
Large Scale Text To Image Generation