Audio-driven Neural Gesture Reenactment with Video Motion Graphs (2207.11524v1)

Published 23 Jul 2022 in cs.CV

Abstract: Human speech is often accompanied by body gestures including arm and hand gestures. We present a method that reenacts a high-quality video with gestures matching a target speech audio. The key idea of our method is to split and re-assemble clips from a reference video through a novel video motion graph encoding valid transitions between clips. To seamlessly connect different clips in the reenactment, we propose a pose-aware video blending network which synthesizes video frames around the stitched frames between two clips. Moreover, we developed an audio-based gesture searching algorithm to find the optimal order of the reenacted frames. Our system generates reenactments that are consistent with both the audio rhythms and the speech content. We evaluate our synthesized video quality quantitatively, qualitatively, and with user studies, demonstrating that our method produces videos of much higher quality and consistency with the target audio compared to previous work and baselines.
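
The abstract's two core ideas, a graph of valid transitions between clips and a search for the clip order that best fits the audio, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function name `best_clip_path`, the use of a single scalar feature per clip and per audio segment, the absolute-difference cost, and the Viterbi-style dynamic program are all hypothetical stand-ins for the authors' actual gesture search.

```python
from typing import Dict, List, Tuple


def best_clip_path(
    transitions: Dict[int, List[int]],
    clip_features: Dict[int, float],
    audio_targets: List[float],
) -> Tuple[float, List[int]]:
    """Toy search over a motion graph (illustrative, not the paper's algorithm).

    transitions[c] lists the clips that may validly follow clip c.
    clip_features[c] is a hypothetical scalar describing clip c.
    audio_targets[t] is a hypothetical scalar for audio segment t.
    Returns the lowest total mismatch and the clip sequence achieving it.
    """
    if not audio_targets:
        raise ValueError("audio_targets must not be empty")

    # cost[c] = lowest total mismatch of any valid path ending at clip c
    cost = {c: abs(f - audio_targets[0]) for c, f in clip_features.items()}
    back: List[Dict[int, int]] = []  # back-pointers, one dict per later step

    for target in audio_targets[1:]:
        new_cost: Dict[int, float] = {}
        pointers: Dict[int, int] = {}
        for prev, prev_cost in cost.items():
            # Only edges present in the graph are allowed transitions.
            for nxt in transitions.get(prev, []):
                cand = prev_cost + abs(clip_features[nxt] - target)
                if nxt not in new_cost or cand < new_cost[nxt]:
                    new_cost[nxt] = cand
                    pointers[nxt] = prev
        if not new_cost:
            raise ValueError("graph has no valid path of the required length")
        cost = new_cost
        back.append(pointers)

    # Trace the best path backwards from the cheapest final clip.
    end = min(cost, key=cost.get)
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return cost[end], path


# Hypothetical three-clip graph: 0 -> 1, 1 -> 2 or 0, 2 -> 0.
example_transitions = {0: [1], 1: [2, 0], 2: [0]}
example_features = {0: 0.0, 1: 0.5, 2: 1.0}
total, order = best_clip_path(
    example_transitions, example_features, [0.0, 0.5, 1.0, 0.0]
)
print(total, order)
```

In this sketch the graph constraint matters: clip 2 has no edge to itself, so two consecutive audio segments that both favor clip 2 cannot both be served by it, and the search settles for the cheapest valid alternative.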

Authors (6)
  1. Yang Zhou (311 papers)
  2. Jimei Yang (58 papers)
  3. Dingzeyu Li (18 papers)
  4. Jun Saito (22 papers)
  5. Deepali Aneja (10 papers)
  6. Evangelos Kalogerakis (44 papers)
Citations (15)
