
Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN (2407.05577v1)

Published 8 Jul 2024 in cs.CV

Abstract: Existing methods for audio-driven talking head video editing suffer from poor visual quality. This paper tackles the problem by seamlessly editing talking face images with different emotions using two modules: (1) an audio-to-landmark module, consisting of the CrossReconstructed Emotion Disentanglement and an alignment network module, which bridges the gap between speech and facial motions by predicting corresponding emotional landmarks from speech; and (2) a landmark-based editing module, which edits face videos via StyleGAN and aims to generate a seamless edited video that combines the emotion and content components of the input audio. Extensive experiments confirm that, compared with state-of-the-art methods, our method provides high-resolution videos with high visual quality.
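
The two-module structure described in the abstract can be pictured as a simple per-frame data flow. The sketch below is only an illustration of that flow, not the authors' implementation: the class `TalkingHeadEditor`, the two callables, and both `dummy_*` stand-ins are hypothetical names introduced here, the emotion disentanglement and alignment network are omitted, and no StyleGAN model is involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Landmarks = List[Tuple[float, float]]  # (x, y) points for one face
Frame = List[List[float]]              # placeholder image: rows of pixel values


@dataclass
class TalkingHeadEditor:
    """Hypothetical wrapper chaining the two stages named in the abstract."""
    audio_to_landmark: Callable[[Sequence[float]], Landmarks]  # stage 1 stand-in
    landmark_editor: Callable[[Frame, Landmarks], Frame]       # stage 2 stand-in

    def edit(self, audio_windows: Sequence[Sequence[float]],
             frames: Sequence[Frame]) -> List[Frame]:
        # Assumption: one audio window per video frame.
        if len(audio_windows) != len(frames):
            raise ValueError("need exactly one audio window per frame")
        edited = []
        for window, frame in zip(audio_windows, frames):
            landmarks = self.audio_to_landmark(window)              # speech -> landmarks
            edited.append(self.landmark_editor(frame, landmarks))   # landmarks -> frame
        return edited


def dummy_audio_to_landmark(window: Sequence[float]) -> Landmarks:
    # Toy rule (not from the paper): mouth opening grows with mean amplitude.
    energy = sum(abs(s) for s in window) / len(window)
    return [(0.5, 0.5 - energy / 2), (0.5, 0.5 + energy / 2)]


def dummy_landmark_editor(frame: Frame, landmarks: Landmarks) -> Frame:
    # Placeholder: returns a copy of the frame. A real editor would
    # synthesize new pixels conditioned on the landmarks.
    return [row[:] for row in frame]


editor = TalkingHeadEditor(dummy_audio_to_landmark, dummy_landmark_editor)
audio = [[0.0, 0.0], [0.2, -0.2]]   # two toy audio windows
video = [[[0.0]], [[1.0]]]          # two toy 1x1 frames
edited_video = editor.edit(audio, video)
```

Keeping the two stages behind separate callables mirrors the abstract's split between predicting landmarks from speech and editing frames from landmarks, so either stand-in could be swapped for a learned model without changing the loop.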

Authors (6)
  1. Jiacheng Su (1 paper)
  2. Kunhong Liu (6 papers)
  3. Liyan Chen (17 papers)
  4. Junfeng Yao (17 papers)
  5. Qingsong Liu (7 papers)
  6. Dongdong Lv (1 paper)
Citations (1)
