Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection (2103.09096v1)

Published 16 Mar 2021 in cs.CV

Abstract: Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries. Though recent works have reached sound achievements, there are still unignorable problems: a) learned features supervised by softmax loss are separable but not discriminative enough, since softmax loss does not explicitly encourage intra-class compactness and interclass separability; and b) fixed filter banks and hand-crafted features are insufficient to capture forgery patterns of frequency from diverse inputs. To compensate for such limitations, a novel frequency-aware discriminative feature learning framework is proposed in this paper. Specifically, we design a novel single-center loss (SCL) that only compresses intra-class variations of natural faces while boosting inter-class differences in the embedding space. In such a case, the network can learn more discriminative features with less optimization difficulty. Besides, an adaptive frequency feature generation module is developed to mine frequency clues in a completely data-driven fashion. With the above two modules, the whole framework can learn more discriminative features in an end-to-end manner. Extensive experiments demonstrate the effectiveness and superiority of our framework on three versions of the FF++ dataset.

Authors (5)

Jiaming Li
Hongtao Xie
Jiahong Li
Zhongyuan Wang
Yongdong Zhang

Summary

Frequency-aware Discriminative Feature Learning for Face Forgery Detection

The increasing sophistication of facial manipulation technologies poses significant challenges to digital forensics and the broader research community. The paper "Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection" proposes a novel framework to address two key limitations in current face forgery detection methods: the inadequacy of softmax loss in generating discriminative features and the restrictive nature of fixed filter banks used in frequency domain analysis.

Key Contributions and Methodology

The authors introduce a Frequency-aware Discriminative Feature Learning (FDFL) framework with two main components: Single-Center Loss (SCL) and an Adaptive Frequency Feature Generation Module (AFFGM). SCL is designed to make the learned features more discriminative, and AFFGM extracts frequency clues adaptively, in a data-driven fashion.

Single-Center Loss (SCL): SCL addresses the limitations of softmax loss and of other metric learning approaches such as triplet and center losses. It compresses intra-class variation only for natural faces, while enlarging the separation between natural and manipulated faces in the embedding space. This selective compactness acknowledges that manipulated face features follow diverse distributions, owing to varied GAN fingerprints and manipulation methods. Because the network is not asked to force them into a single compact cluster, optimization becomes less difficult.
Adaptive Frequency Feature Generation Module (AFFGM): Unlike traditional approaches that rely on preset filter banks, AFFGM processes frequency information dynamically, retaining the spatial relationship of frequency components and employing a data-driven mining technique to capture subtle forgery clues. This module capitalizes on differences in frequency domain patterns, particularly in middle and high bands, to improve forgery detection efficacy.
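One way to picture "retaining the spatial relationship of frequency components" is a block transform whose coefficients are regrouped by frequency index, so that each output channel is a small spatial map for one frequency. The sketch below is an illustration under assumptions, not the authors' module: the block DCT, the block size of 8, the single-channel input, and the function names are all chosen here for the example, and the learned, data-driven part of AFFGM is omitted entirely.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def block_dct(block, basis):
    """2D DCT of a square block, computed as basis @ block @ basis^T."""
    n = len(basis)
    tmp = [[sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]

def frequency_channels(image, n=8):
    """Regroup block-DCT coefficients into n*n channels of shape (H/n, W/n).

    Channel u*n + v holds coefficient (u, v) of every block, placed at the
    block's own grid position, so the spatial layout of blocks is preserved.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image sides must be multiples of the block size")
    basis = dct_matrix(n)
    rows, cols = height // n, width // n
    channels = [[[0.0] * cols for _ in range(rows)] for _ in range(n * n)]
    for br in range(rows):
        for bc in range(cols):
            block = [image[br * n + i][bc * n:(bc + 1) * n] for i in range(n)]
            coeffs = block_dct(block, basis)
            for u in range(n):
                for v in range(n):
                    channels[u * n + v][br][bc] = coeffs[u][v]
    return channels
```

In a setup like this, the regrouped channels would be fed to learnable layers that decide which bands matter, rather than being weighted by a preset filter bank. That is where the data-driven aspect described above would enter.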

Experimental Results and Implications

The authors evaluate the framework on three versions of the FaceForensics++ (FF++) dataset and report that it outperforms existing methods. The gains are most notable on the challenging, heavily compressed c40 version, where both AUC and pAUC at low false alarm rates improve.
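For readers unfamiliar with pAUC, the sketch below computes the area under the ROC curve restricted to low false positive rates. It is a generic illustration: the cutoff `max_fpr`, the function name, and the absence of normalization are choices made here and may differ from the paper's exact protocol.

```python
def partial_auc(scores, labels, max_fpr=0.1):
    """Area under the ROC curve for false positive rates in [0, max_fpr].

    scores: higher means "more likely manipulated"
    labels: 1 for manipulated (positive), 0 for natural (negative)
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one sample of each class")
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        j = i
        # Samples with tied scores move the ROC point together.
        while j < len(order) and order[j][0] == order[i][0]:
            tp += order[j][1]
            fp += 1 - order[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With this unnormalized definition a perfect detector scores `max_fpr` rather than 1, so values are only comparable at the same cutoff.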

The use of SCL proved effective in creating a more discriminative embedding space, confirmed by t-SNE visualizations that showed cohesive clustering of natural faces away from manipulated ones. This reflects the benefits of a metric learning approach tailored to the peculiarities of face forgery data distributions.
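The clustering pattern described above follows from how a loss of this kind is shaped. As a minimal sketch, assume a hinge-style form: the mean distance of natural faces to a single center is minimized, and manipulated faces are pushed at least a margin farther from that center than natural ones. The margin value, its scaling by the square root of the feature dimension, and the function name are assumptions made for illustration and should be checked against the paper.

```python
import math

def single_center_loss(features, labels, center, margin=0.3):
    """Hinge-style single-center loss (illustrative form, not verified
    against the paper's exact definition).

    features: list of D-dimensional embeddings (lists of floats)
    labels:   0 for natural faces, 1 for manipulated faces
    center:   D-dimensional center of the natural class (learned in practice)
    """
    dim = len(center)
    nat = [math.dist(f, center) for f, y in zip(features, labels) if y == 0]
    man = [math.dist(f, center) for f, y in zip(features, labels) if y == 1]
    if not nat or not man:
        raise ValueError("batch must contain natural and manipulated samples")
    m_nat = sum(nat) / len(nat)  # mean distance of natural faces to the center
    m_man = sum(man) / len(man)  # mean distance of manipulated faces
    # First term pulls natural faces toward the center. The hinge term is
    # zero once manipulated faces are, on average, at least
    # margin * sqrt(D) farther from the center than natural faces.
    return m_nat + max(m_nat - m_man + margin * math.sqrt(dim), 0.0)
```

Note that nothing in this form pulls manipulated faces toward each other: they only need to stay away from the natural center. In training, a term like this would typically be added to the softmax loss with a weighting factor.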

Future Directions

The paper highlights the need for further research into generalizing detection to unseen manipulation techniques, a known limitation of supervised approaches. Future work could explore semi-supervised and unsupervised learning to improve generalization, and could exploit temporal inconsistencies in video data to make detection more robust. SCL and AFFGM could also be adapted to related tasks such as face anti-spoofing, which suggests broader use in digital media forensics.
