
FlexiFilm: Long Video Generation with Flexible Conditions (2404.18620v1)

Published 29 Apr 2024 in cs.CV

Abstract: Generating long and consistent videos has emerged as a significant yet challenging problem. While most existing diffusion-based video generation models, derived from image generation models, demonstrate promising performance in generating short videos, their simple conditioning mechanism and sampling strategy, both originally designed for image generation, cause severe performance degradation when adapted to long video generation. This results in prominent temporal inconsistency and overexposure. Thus, in this work, we introduce FlexiFilm, a new diffusion model tailored for long video generation. Our framework incorporates a temporal conditioner to establish a more consistent relationship between generation and multi-modal conditions, and a resampling strategy to tackle overexposure. Empirical results demonstrate that FlexiFilm generates long and consistent videos, each over 30 seconds in length, outperforming competitors in qualitative and quantitative analyses. Project page: https://y-ichen.github.io/FlexiFilm-Page/

Authors (5)
  1. Yichen Ouyang (3 papers)
  2. Hao Zhao (139 papers)
  3. Gaoang Wang (68 papers)
  4. Bo Zhao (242 papers)
  5. Jianhao Yuan (10 papers)
Citations (4)
