Local Relation Learning for Face Forgery Detection (2105.02577v1)

Published 6 May 2021 in cs.CV

Abstract: With the rapid development of facial manipulation techniques, face forgery detection has received considerable attention in digital media forensics due to security concerns. Most existing methods formulate face forgery detection as a classification problem and utilize binary labels or manipulated region masks as supervision. However, without considering the correlation between local regions, these global supervisions are insufficient to learn a generalized feature and prone to overfitting. To address this issue, we propose a novel perspective of face forgery detection via local relation learning. Specifically, we propose a Multi-scale Patch Similarity Module (MPSM), which measures the similarity between features of local regions and forms a robust and generalized similarity pattern. Moreover, we propose an RGB-Frequency Attention Module (RFAM) to fuse information in both RGB and frequency domains for more comprehensive local feature representation, which further improves the reliability of the similarity pattern. Extensive experiments show that the proposed method consistently outperforms the state-of-the-arts on widely-used benchmarks. Furthermore, detailed visualization shows the robustness and interpretability of our method.


Local Relation Learning for Face Forgery Detection

The paper "Local Relation Learning for Face Forgery Detection" by Shen Chen et al. addresses face forgery detection, a problem made more pressing by the sophistication of manipulation techniques such as Deepfakes and FaceSwap. Most earlier methods treat the task as a classification problem supervised by binary labels or manipulated-region masks. The authors argue that such global supervision ignores the correlation between local regions, and they instead build detection on local relation learning.

The authors propose a Multi-scale Patch Similarity Module (MPSM) that captures similarity patterns between the local features of different regions of a facial image. The design rests on the observation that forged and real regions within an image exhibit distinct similarity characteristics. The MPSM models these second-order relationships by measuring the pairwise cosine similarity of patch features at several scales, which indicates where forgery artifacts may lie. To strengthen the local feature representation, the paper adds an RGB-Frequency Attention Module (RFAM). This module fuses information from the RGB and frequency domains, using the Discrete Cosine Transform to emphasize the high-frequency artifacts that typically indicate forgery.
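The two ideas in the paragraph above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`patch_similarity`, `dct_matrix`, `high_frequency_component`), the non-overlapping patch layout, the `cutoff` rule, and the random stand-in feature map are all assumptions made for illustration, and the attention-based fusion in RFAM is not modeled.

```python
import numpy as np

def patch_similarity(feat, patch):
    """Pairwise cosine similarity between non-overlapping patches.

    feat: (C, H, W) feature map. Returns an (N, N) matrix with
    N = (H // patch) * (W // patch). Hypothetical helper.
    """
    c, h, w = feat.shape
    gh, gw = h // patch, w // patch
    # Crop so the map divides evenly, then flatten each patch to one vector.
    x = feat[:, :gh * patch, :gw * patch]
    x = x.reshape(c, gh, patch, gw, patch).transpose(1, 3, 0, 2, 4)
    x = x.reshape(gh * gw, -1)
    # L2-normalize rows so the dot product equals cosine similarity.
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    return x @ x.T

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0] /= np.sqrt(2.0)
    return d

def high_frequency_component(img, cutoff):
    """Zero 2-D DCT coefficients with (row + col) < cutoff, then invert.

    The diagonal cutoff is an assumed, simple high-pass rule.
    """
    n, m = img.shape
    dn, dm = dct_matrix(n), dct_matrix(m)
    coeffs = dn @ img @ dm.T          # forward 2-D DCT
    rows, cols = np.indices(coeffs.shape)
    coeffs[rows + cols < cutoff] = 0.0
    return dn.T @ coeffs @ dm         # inverse 2-D DCT

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))  # stand-in for a CNN feature map
# One similarity pattern per patch size gives a multi-scale description.
patterns = {p: patch_similarity(feat, p) for p in (2, 4, 8)}
```

In a forged image one would expect the similarity pattern to show a block of patches that resemble each other but differ from the rest of the face. The sketch computes only the pattern itself, not a classifier on top of it.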

The method is reported to be robust to variations commonly found in manipulated content, such as different compression qualities and noise. Extensive experiments on datasets such as FaceForensics++ show that the approach consistently surpasses prior state-of-the-art methods. The paper reports an accuracy (ACC) of 99.87% on uncompressed raw videos and 91.47% on low-quality videos, along with Area Under the Curve (AUC) improvements across multiple benchmarks.

The research has implications for both the theory and the practice of forgery detection. Attending to local regions rather than a single global descriptor offers a way to capture finer-grained anomalies, a capability that may extend to other areas of digital forensics. The combination of frequency-domain and spatial features also suggests that multi-domain representations could improve accuracy and robustness in other image processing tasks.

Looking forward, open questions include how local relation learning can be integrated into broader multimedia forensic frameworks and real-time applications. Future work might also examine unsupervised or semi-supervised settings, where the explicit labeled supervision used in this paper may not be readily available.

In summary, this work advances face forgery detection with a technique grounded in local patch relations and augmented by frequency-domain features. The approach offers both interpretability and robustness to image quality degradation, and it provides a foundation for future work in digital forensics.


Authors (6)

Shen Chen
Taiping Yao
Yang Chen
Shouhong Ding
Jilin Li
Rongrong Ji

Citations (227)

