XAI-Based Detection of Adversarial Attacks on Deepfake Detectors (2403.02955v2)

Published 5 Mar 2024 in cs.CR and cs.CV

Abstract: We introduce a novel methodology for identifying adversarial attacks on deepfake detectors using eXplainable Artificial Intelligence (XAI). In an era characterized by digital advancement, deepfakes have emerged as a potent tool, creating a demand for efficient detection systems. However, these systems are frequently targeted by adversarial attacks that degrade their performance. We address this gap by developing a defensible deepfake detector that leverages the power of XAI. The proposed methodology uses XAI to generate interpretability maps for a given detection method, providing explicit visualizations of the decision-making factors within the AI model. We then employ a pretrained feature extractor that processes both the input image and its corresponding XAI map. The feature embeddings extracted from this process are used to train a simple yet effective classifier. Our approach contributes not only to the detection of deepfakes but also enhances the understanding of possible adversarial attacks, pinpointing potential vulnerabilities. Furthermore, the approach does not change the performance of the underlying deepfake detector. The paper demonstrates promising results, suggesting a potential pathway for future deepfake detection mechanisms. We believe this study will serve as a valuable contribution to the community, sparking much-needed discourse on safeguarding deepfake detectors.
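
As a rough sketch of the pipeline the abstract outlines, the snippet below wires the three stages together: an XAI map of the detector's decision, a frozen pretrained feature extractor applied to both the image and its map, and a simple classifier over the concatenated embeddings. The concrete choices here (vanilla gradient saliency as the XAI method, a torchvision ResNet-50 as the extractor, a linear head as the classifier) are illustrative assumptions, not the configuration fixed by the paper.

import torch
import torch.nn as nn
from torchvision import models

def saliency_map(detector: nn.Module, images: torch.Tensor) -> torch.Tensor:
    # Vanilla gradient saliency: |d(top logit)/d(input)| per pixel.
    # (Illustrative choice; any attribution method could stand in here.)
    images = images.clone().requires_grad_(True)
    logits = detector(images)
    logits.amax(dim=1).sum().backward()                 # top-class score per sample
    sal = images.grad.abs().amax(dim=1, keepdim=True)   # collapse RGB channels
    return sal.repeat(1, 3, 1, 1)                       # 3 channels for the extractor

# Frozen pretrained feature extractor: ResNet-50 minus its classification head.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
extractor = nn.Sequential(*list(backbone.children())[:-1]).eval()

@torch.no_grad()
def embed(x: torch.Tensor) -> torch.Tensor:
    return extractor(x).flatten(1)                      # (B, 2048) embeddings

def attack_features(detector: nn.Module, images: torch.Tensor) -> torch.Tensor:
    # Embed the input image and its XAI map, then concatenate.
    sal = saliency_map(detector, images)
    return torch.cat([embed(images), embed(sal)], dim=1)  # (B, 4096)

# "Simple yet effective classifier": a linear head trained on these features
# to separate benign inputs from adversarially perturbed ones. The deepfake
# detector's own weights are never modified, matching the abstract's claim
# that its detection performance is unchanged.
attack_classifier = nn.Linear(2 * 2048, 2)

Training attack_classifier requires only pairs of benign and attacked examples passed through attack_features; because the detector is used solely to produce the XAI map, the defense sits beside it rather than altering it.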
