
Image Conductor: Precision Control for Interactive Video Synthesis (2406.15339v1)

Published 21 Jun 2024 in cs.CV, cs.AI, and cs.MM

Abstract: Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capturing. Despite advancements in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method for precise control of camera transitions and object movements to generate video assets from a single image. A well-cultivated training strategy is proposed to separate distinct camera and object motion through camera LoRA weights and object LoRA weights. To further address cinematographic variations from ill-posed trajectories, we introduce a camera-free guidance technique during inference, enhancing object movements while eliminating camera transitions. Additionally, we develop a trajectory-oriented video motion data curation pipeline for training. Quantitative and qualitative experiments demonstrate our method's precision and fine-grained control in generating motion-controllable videos from images, advancing the practical application of interactive video synthesis. Project webpage available at https://liyaowei-stu.github.io/project/ImageConductor/

Authors (8)
  1. Yaowei Li (23 papers)
  2. Xintao Wang (132 papers)
  3. Zhaoyang Zhang (273 papers)
  4. Zhouxia Wang (16 papers)
  5. Ziyang Yuan (27 papers)
  6. Liangbin Xie (17 papers)
  7. Yuexian Zou (119 papers)
  8. Ying Shan (252 papers)
Citations (8)