End-to-End Speech-Driven Facial Animation with Temporal GANs

Published 23 May 2018 in eess.AS, cs.CV, cs.SD, and eess.IV | (1805.09313v4)

Abstract: Speech-driven facial animation is the process that automatically synthesizes a talking character from speech signals. Most work in this domain creates a mapping from audio features to visual features, which often requires post-processing with computer graphics techniques to produce realistic, albeit subject-dependent, results. We present a system for generating videos of a talking head, using a still image of a person and an audio clip containing speech, that does not rely on any handcrafted intermediate features. To the best of our knowledge, this is the first method capable of generating subject-independent realistic videos directly from raw audio. Our method can generate videos that have (a) lip movements in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. We achieve this by using a temporal GAN with two discriminators, which are capable of capturing different aspects of the video. The effect of each component in our system is quantified through an ablation study. The generated videos are evaluated based on their sharpness, reconstruction quality, and lip-reading accuracy. Finally, a user study is conducted, confirming that temporal GANs lead to more natural sequences than a static GAN-based approach.

Citations (102)

Authors (3)
