
Detecting AutoAttack Perturbations in the Frequency Domain

Published 16 Nov 2021 in cs.CV and cs.CR (arXiv:2111.08785v3)

Abstract: Adversarial attacks on image classification networks by the AutoAttack (Croce and Hein, 2020b) framework have recently drawn a lot of attention. While AutoAttack shows a very high attack success rate, most defense approaches focus on network hardening and robustness enhancements, such as adversarial training. With this approach, the currently best-reported method can withstand about 66% of adversarial examples on CIFAR10. In this paper, we investigate the spatial and frequency domain properties of AutoAttack and propose an alternative defense: instead of hardening a network, we detect adversarial attacks during inference and reject manipulated inputs. Based on a rather simple and fast analysis in the frequency domain, we introduce two different detection algorithms. The first is a black-box detector that operates only on the input images and achieves a detection accuracy of 100% on the AutoAttack CIFAR10 benchmark and 99.3% on ImageNet, for epsilon = 8/255 in both cases. The second is a white-box detector based on an analysis of CNN feature maps, which reaches detection rates of 100% and 98.7% on the same two benchmarks.
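
To make the black-box variant concrete, below is a minimal sketch, assuming the detector extracts per-channel 2D Fourier magnitude spectra from the input images and feeds them to a simple binary classifier (here logistic regression). The paper's exact features and classifier may differ, and the names fourier_features, fit_detector, clean_train, and adv_train are hypothetical.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fourier_features(images):
        """Flattened log-magnitude 2D FFT per channel.

        images: float array of shape (N, H, W, C), values in [0, 1].
        """
        spec = np.fft.fft2(images, axes=(1, 2))           # per-channel 2D FFT
        mag = np.abs(np.fft.fftshift(spec, axes=(1, 2)))  # center low frequencies
        return np.log1p(mag).reshape(len(images), -1)     # compress dynamic range

    def fit_detector(clean_train, adv_train):
        """Fit a binary detector on benign vs. AutoAttack-perturbed images."""
        X = np.concatenate([fourier_features(clean_train),
                            fourier_features(adv_train)])
        y = np.concatenate([np.zeros(len(clean_train)),
                            np.ones(len(adv_train))])
        return LogisticRegression(max_iter=1000).fit(X, y)

    # Inference-time use: reject inputs the detector flags as adversarial.
    # detector = fit_detector(clean_train, adv_train)
    # reject = detector.predict(fourier_features(test_images)).astype(bool)

Under the same assumptions, the white-box variant would apply an analogous transform to intermediate CNN feature maps instead of raw pixels before fitting the classifier.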

References (27)
  1. Square Attack: a query-efficient black-box adversarial attack via random search. In ECCV, 2020.
  2. Dimensionality reduction as a defense against evasion attacks on machine learning classifiers. arXiv:1704.02654, 2017.
  3. Detecting adversarial samples using influence functions and nearest neighbors. In CVPR, pp. 14441–14450, 2020.
  4. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19:297–301, 1965.
  5. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In ICML, 2020a.
  6. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In ICML, 2020b.
  7. Minimally distorted adversarial examples with a fast adaptive boundary attack. In ICML, 2020c.
  8. RobustBench: a standardized adversarial robustness benchmark. arXiv:2010.09670, 2020.
  9. AdverTorch v0.1: an adversarial robustness toolbox based on PyTorch. arXiv:1902.07623, 2019.
  10. Detecting adversarial samples from artifacts. arXiv:1703.00410, 2017.
  11. Explaining and harnessing adversarial examples. arXiv:1412.6572, 2015.
  12. On the (statistical) detection of adversarial examples. arXiv:1702.06280, 2017.
  13. SpectralDefense: detecting adversarial attacks on CNNs in the Fourier domain. arXiv:2103.03000, 2021.
  14. Early methods for detecting adversarial images. arXiv preprint, 2017.
  15. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In NeurIPS, 2018.
  16. Adversarial examples detection in deep networks with convolutional filter statistics. In ICCV, pp. 5775–5783, 2017.
  17. Detection based defense against adversarial examples from the steganalysis point of view. In CVPR, pp. 4820–4829, 2019.
  18. Effective and robust detection of adversarial examples via Benford-Fourier coefficients. arXiv:2005.05552, 2020.
  19. Characterizing adversarial subspaces using local intrinsic dimensionality. arXiv:1801.02613, 2018.
  20. Towards deep learning models resistant to adversarial attacks. arXiv:1706.06083, 2018.
  21. On detecting adversarial perturbations. arXiv:1702.04267, 2017.
  22. Technical report on the CleverHans v2.1.0 adversarial examples library. arXiv preprint, 2016.
  23. Foolbox: a Python toolbox to benchmark the robustness of machine learning models. arXiv:1707.04131, 2018.
  24. On adaptive attacks to adversarial example defenses. arXiv:2002.08347, 2020.
  25. On the structural sensitivity of deep convolutional networks to the directions of Fourier basis functions. In CVPR, pp. 51–60, 2019.
  26. A Fourier perspective on model robustness in computer vision. arXiv:1906.08988, 2019.
  27. Wide residual networks. In BMVC, 2017.
Citations (14)
