CAE-Net: Generalized Deepfake Image Detection using Convolution and Attention Mechanisms with Spatial and Frequency Domain Features (2502.10682v2)

Published 15 Feb 2025 in cs.CV, cs.LG, and eess.IV

Abstract: Effective deepfake detection tools are becoming increasingly essential to the growing usage of deepfakes in unethical practices. There exists a wide range of deepfake generation techniques, which makes it challenging to develop an accurate universal detection mechanism. The 2025 IEEE Signal Processing Cup (\textit{DFWild-Cup} competition) provided a diverse dataset of deepfake images containing significant class imbalance. The images in the dataset are generated from multiple deepfake image generators, for training machine learning model(s) to emphasize the generalization of deepfake detection. To this end, we proposed a disjoint set-based multistage training method to address the class imbalance and devised an ensemble-based architecture \emph{CAE-Net}. Our architecture consists of a convolution- and attention-based ensemble network, and employs three different neural network architectures: EfficientNet, Data-Efficient Image Transformer (DeiT), and ConvNeXt with wavelet transform to capture both local and global features of deepfakes. We visualize the specific regions that these models focus on for classification using Grad-CAM, and empirically demonstrate the effectiveness of these models in grouping real and fake images into cohesive clusters using t-SNE plots. Individually, the EfficientNet B0 architecture has achieved 90.79\% accuracy, whereas the ConvNeXt and the DeiT architecture have achieved 89.49\% and 89.32\% accuracy, respectively. With these networks, our weighted ensemble model achieves an excellent accuracy of 94.63\% on the validation dataset of the SP Cup 2025 competition. The equal error rate of 4.72\% and the Area Under the ROC curve of 97.37\% further confirm the stability of our proposed method. Finally, the robustness of our proposed model against adversarial perturbation attacks is tested as well, showing the inherent defensive properties of the ensemble approach.

View on arXiv

Authors (7)

Kafi Anan (1 paper)
Anindya Bhattacharjee (1 paper)
Ashir Intesher (1 paper)
Kaidul Islam (1 paper)
Abrar Assaeem Fuad (1 paper)
Utsab Saha (5 papers)
Hafiz Imtiaz (11 papers)

Summary

We haven't generated a summary for this paper yet.

Summarize Now

CAE-Net: Generalized Deepfake Image Detection using Convolution and Attention Mechanisms with Spatial and Frequency Domain Features (2502.10682v2)

Summary

Related Papers