CNN Detection of GAN-Generated Face Images based on Cross-Band Co-occurrences Analysis
The paper introduces a novel method for detecting Generative Adversarial Network (GAN)-generated face images that feeds cross-band co-occurrence matrices to a Convolutional Neural Network (CNN). The focus is on exploiting inconsistencies among color channels to differentiate between authentic and artificially generated images. By extending previous methodologies, which relied primarily on spatial co-occurrence matrices computed within individual color bands, the proposed approach aims to improve detection accuracy and robustness, particularly against post-processing operations such as resizing, noise addition, and contrast adjustment.
Methodology and Results
This research leverages the fact that modern GANs, such as StyleGAN2, can produce images that are visually indistinguishable from real ones, eliminating or minimizing spatial discrepancies. Nonetheless, accurately reconstructing the relationships among color channels remains challenging for these generators. The CNN model, named Cross-CoNet in this paper, is trained on both spatial co-occurrence matrices and inter-channel co-occurrence matrices computed from pairs of color bands, namely the RG, RB, and GB cross-band matrices.
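To make the feature extraction concrete, the sketch below computes a co-occurrence matrix between two 8-bit bands: passing the same band twice with a spatial offset yields an intra-band spatial co-occurrence, while passing two different bands yields a cross-band one. This is a minimal NumPy illustration of the general idea, not the paper's exact implementation; the function names, the zero default offset for the cross-band case, and the stacking order of the RG/RB/GB matrices are assumptions.

```python
import numpy as np

def cooccurrence(band_a, band_b, offset=(0, 0), levels=256):
    """Co-occurrence matrix between two equal-size 8-bit bands.

    Passing the same band twice (with a nonzero offset) gives a spatial
    co-occurrence matrix; passing two different bands gives a cross-band
    matrix. `offset` is the (dy, dx) displacement applied to band_b.
    NOTE: a hedged sketch of the general technique, not the paper's code.
    """
    dy, dx = offset
    h, w = band_a.shape
    # Keep only the region where band_a[i, j] and band_b[i+dy, j+dx] overlap
    a = band_a[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = band_b[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
    # Histogram the gray-level pairs into a levels x levels count matrix
    mat = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(mat, (a.ravel(), b.ravel()), 1)
    return mat

def cross_band_features(img):
    """Stack the RG, RB, and GB cross-band matrices as a 3-channel tensor
    (assumed CNN input layout; img is an H x W x 3 uint8 RGB array)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return np.stack([cooccurrence(r, g),
                     cooccurrence(r, b),
                     cooccurrence(g, b)], axis=-1)
```

In this sketch, GAN color inconsistencies would surface as a different mass distribution in the cross-band matrices (e.g., away from the diagonal) compared with camera-native images, which is the statistical cue the CNN is trained to pick up.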
The experiments show that Cross-CoNet outperforms prior detection techniques that use only intra-band spatial co-occurrence matrices, exhibiting greater robustness to geometric transformations, filtering operations, and contrast manipulations. This robustness is crucial, since most existing forensic methods degrade significantly when manipulated media undergo common post-processing operations.
Numerical Performance Insights
The Cross-CoNet model achieves near-perfect detection accuracy (99.70%) on unaltered StyleGAN2 images. Its main strength, however, lies in its robustness across a spectrum of post-processing operations. Where alternative methods can drop to accuracies near 50%, Cross-CoNet maintains significantly higher performance, typically above 75%, even in challenging cases such as adaptive histogram equalization and blurring followed by sharpening. Notably, the paper also presents an extended version of Cross-CoNet, trained on compressed images to address vulnerability to JPEG compression artifacts, which are known to compromise the detection accuracy of baseline models. This JPEG-aware Cross-CoNet demonstrates considerable resilience under both matched and mismatched JPEG quality factors.
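The JPEG-aware variant described above amounts to a training-time augmentation: re-encoding each training image at a (possibly randomized) JPEG quality factor before feature extraction. A minimal sketch of such an augmentation step is shown below; the function name, the Pillow-based round-trip, and the quality-factor range are illustrative assumptions, not the paper's exact settings.

```python
import io
import random
from PIL import Image

def jpeg_augment(pil_img, qf_range=(85, 95)):
    """Re-encode an RGB image at a random JPEG quality factor.

    Sketch of JPEG-aware training augmentation: the image is compressed
    in memory and decoded back, so downstream feature extraction sees
    realistic JPEG artifacts. The quality range is an assumption.
    """
    qf = random.randint(*qf_range)
    buf = io.BytesIO()
    pil_img.convert("RGB").save(buf, format="JPEG", quality=qf)
    buf.seek(0)
    return Image.open(buf).convert("RGB")
```

Training on such compressed copies (alone or mixed with uncompressed ones) is what lets the detector remain effective when test images arrive at an unknown quality factor.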
Theoretical and Practical Implications
The proposed method advances the field of digital image forensics by emphasizing the importance of inter-channel analysis for the detection of fake media content generated by sophisticated GANs. The findings suggest that future developments in image forensic tools should prioritize algorithms capable of analyzing both spatial and spectral inconsistencies to enhance robustness. Moreover, this approach holds significant potential for automated systems aimed at validating the authenticity of digital content in various applications, such as social media, journalism, and law enforcement.
Future Research Directions
Potential extensions of this work include defenses against informed adversaries mounting adversarial attacks, and further exploration of generalization to unseen datasets and generators without retraining. Evaluating the method's response to a print-and-scan attack offers another intriguing avenue for investigation, ensuring resilience against practical attacks aimed at evading detection systems.
In summary, the paper presents a compelling contribution to digital forensics, highlighting the necessity for inter-channel feature analysis in discerning GAN-generated imagery. The robustness afforded by the cross-band co-occurrence approach sets a promising standard for the future of synthetic media detection technologies.