Exposing DeepFake Videos by Detecting Face Warping Artifacts
The paper authored by Yuezun Li and Siwei Lyu, titled "Exposing DeepFake Videos By Detecting Face Warping Artifacts," presents a deep learning-based approach to distinguishing DeepFake videos from authentic ones. The research builds on the observation that current DeepFake algorithms can only generate images of limited resolution, which must then undergo affine warping to match the configuration of the source faces in the original video; this warping leaves detectable artifacts. The authors exploit these artifacts with convolutional neural networks (CNNs) to build a more efficient and generalizable detection method.
Methodology
The core idea rests on identifying artifacts introduced during the face warping process in DeepFake generation pipelines. This method has two primary advantages:
- Data Generation Efficiency: Unlike prior approaches that require a large corpus of DeepFake-generated images for training, the proposed method simulates warping artifacts directly with simple image-processing operations. This stands in stark contrast to training-intensive DeepFake models, yielding significant savings in time and computational resources.
- Robustness: By targeting general artifacts, the method achieves robustness across different DeepFake video sources, mitigating the risk of overfitting to a specific DeepFake distribution.
The training pipeline applies affine transformations to real face images to simulate the resolution inconsistencies indicative of DeepFake artifacts, so negative training examples can be generated dynamically with standard image-processing tools rather than by running a DeepFake model; this keeps the training data diverse and comprehensive. Four CNN architectures are evaluated: VGG16, ResNet50, ResNet101, and ResNet152.
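The following minimal sketch illustrates how such a negative example could be synthesized from a real face crop, assuming OpenCV; the function name and parameter values are illustrative choices, not the authors' released code:

```python
import cv2
import numpy as np

def simulate_warping_artifacts(face, scale=0.25, blur_sigma=3.0):
    """Mimic DeepFake-style warping artifacts on a real face crop.

    Down-scales the face to a low resolution, smooths it, and scales
    it back up, reproducing the resolution mismatch that the affine
    warp introduces in DeepFake pipelines. All parameters here are
    illustrative assumptions.
    """
    h, w = face.shape[:2]
    # Down-scale to mimic the limited resolution of generated faces.
    low = cv2.resize(face, (max(1, int(w * scale)), max(1, int(h * scale))),
                     interpolation=cv2.INTER_AREA)
    # Gaussian smoothing approximates the blending/post-processing step.
    low = cv2.GaussianBlur(low, (0, 0), blur_sigma)
    # Scale back up: the interpolation leaves the tell-tale artifacts.
    return cv2.resize(low, (w, h), interpolation=cv2.INTER_LINEAR)

# Usage on a synthetic stand-in for a cropped face image.
face = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
negative_example = simulate_warping_artifacts(face)
```

Because negatives are produced on the fly like this, training never requires running a DeepFake generator, which is the source of the method's data-generation efficiency.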
Experimental Results
The method was validated on two established DeepFake-detection benchmarks, UADFV and DeepfakeTIMIT:
- UADFV Dataset:
  - Image-based Evaluation: The ResNet50 model achieved the highest performance with an AUC of 97.4%, outperforming VGG16, ResNet101, and ResNet152.
  - Video-based Evaluation: Again, ResNet50 exhibited the best performance, with an AUC of 98.7%.
- DeepfakeTIMIT Dataset:
  - Low Quality (LQ) Videos: ResNet50 achieved an AUC of 99.9%.
  - High Quality (HQ) Videos: ResNet50 reached an AUC of 93.2%, significantly outperforming the other models despite the increased challenge posed by higher-quality forgeries.
These results highlight the efficacy of the proposed method in detecting DeepFakes with high accuracy across various quality settings and video sources.
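For context, video-level AUC is typically obtained by aggregating the CNN's per-frame scores into a single score per video. The sketch below, assuming scikit-learn and a simple mean-aggregation rule (the paper's exact aggregation may differ), shows the computation:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def video_score(frame_scores):
    """Aggregate per-frame fake probabilities into one video-level score.

    Averaging is one simple choice of aggregation rule; the paper's
    exact rule may differ.
    """
    return float(np.mean(frame_scores))

# Hypothetical evaluation data: (label, per-frame CNN scores),
# where label 1 = DeepFake and 0 = real.
videos = [
    (1, [0.91, 0.88, 0.95]),  # a DeepFake video
    (0, [0.10, 0.22, 0.05]),  # a real video
]
labels = [label for label, _ in videos]
scores = [video_score(frames) for _, frames in videos]
print("video-level AUC:", roc_auc_score(labels, scores))
```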
Implications and Future Directions
The paper’s approach represents a significant stride in DeepFake detection research by targeting facial warping artifacts that are common across DeepFake generation pipelines. The robust performance across datasets demonstrates the potential for real-world application, especially in scenarios demanding quick turnaround and limited computational resources.
Potential future developments include:
- Robustness Evaluation: Expanding evaluations to cover multiple stages of video compression and degradation, assessing the method’s reliability in diverse, real-world scenarios (a minimal compression-sweep sketch follows this list).
- Dedicated Network Architectures: Designing specialized network architectures optimized explicitly for artifact detection, which could offer performance gains over standard networks like ResNet and VGG.
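As one way to approach such a robustness evaluation, the hypothetical sweep below re-encodes a frame as JPEG at decreasing quality and would feed each degraded copy to the trained detector. The `recompress` helper and the chosen quality levels are illustrative assumptions, not an evaluation protocol from the paper:

```python
import cv2
import numpy as np

def recompress(image, quality):
    """Re-encode an image as JPEG at the given quality (0-100),
    simulating one stage of lossy compression."""
    ok, buf = cv2.imencode(".jpg", image, [cv2.IMWRITE_JPEG_QUALITY, quality])
    assert ok, "JPEG encoding failed"
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)

# Synthetic stand-in for an extracted video frame.
frame = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

for q in (90, 70, 50, 30):
    degraded = recompress(frame, q)
    # score = detector(degraded)  # plug in the trained CNN here
    print(f"quality={q}, frame shape={degraded.shape}")
```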
Overall, this work contributes a valuable technique to the arsenal of tools available for combating misinformation and digital forgeries, addressing an increasing societal concern regarding the authenticity of digital media.