
MALT Powers Up Adversarial Attacks

Published 2 Jul 2024 in cs.LG, cs.CR, cs.NE, and stat.ML | arXiv:2407.02240v1

Abstract: Current adversarial attacks for multi-class classifiers choose the target class for a given input naively, based on the classifier's confidence levels for the various candidate classes. We present a novel adversarial targeting method, MALT (Mesoscopic Almost Linearity Targeting), based on medium-scale almost-linearity assumptions. Our attack outperforms the current state-of-the-art attack, AutoAttack, on the standard benchmark datasets CIFAR-100 and ImageNet and across a variety of robust models. In particular, our attack is five times faster than AutoAttack while matching all of AutoAttack's successes and attacking additional samples that were previously out of reach. We then prove formally, and demonstrate empirically, that our targeting method, although inspired by linear predictors, also applies to standard non-linear models.
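
The abstract does not give MALT's exact scoring rule, but its stated motivation from linear predictors suggests a natural reading: for a linear classifier, the distance from an input x to the decision boundary between the predicted class and a candidate class j equals the logit gap divided by the norm of the gradient of that gap, so candidate target classes can be ranked by that estimate instead of by raw confidence. Below is a minimal PyTorch sketch under that assumption; the function malt_style_targets and its exact scoring formula are illustrative reconstructions, not the paper's implementation.

```python
# Hypothetical sketch of almost-linearity-based targeting. The scoring
# rule (logit gap over gradient norm) is exact for linear models and is
# assumed here as a proxy; it is not taken verbatim from the paper.
import torch

def malt_style_targets(model, x, num_targets=5):
    """Rank candidate target classes for a single input x (shape C,H,W).

    Naive targeting sorts classes by confidence (logit). Under an
    almost-linearity assumption, a better proxy for "easiest class to
    reach" is the logit gap to the predicted class divided by the norm
    of the gradient of that gap: for a truly linear model this equals
    the distance to the decision boundary between the two classes.
    """
    x = x.clone().requires_grad_(True)
    logits = model(x.unsqueeze(0)).squeeze(0)  # (num_classes,)
    pred = logits.argmax().item()

    scores = {}
    # For many-class datasets (e.g., ImageNet's 1000 classes) one would
    # shortlist the top-k classes by logit first; the backward pass per
    # class is the expensive step.
    for j in range(logits.shape[0]):
        if j == pred:
            continue
        gap = logits[pred] - logits[j]  # margin separating pred from j
        (grad,) = torch.autograd.grad(gap, x, retain_graph=True)
        # Estimated distance to the pred-vs-j decision boundary under
        # the local-linearity assumption.
        scores[j] = (gap / grad.flatten().norm()).item()

    # Smallest estimated boundary distance first.
    return sorted(scores, key=scores.get)[:num_targets]
```

In a full pipeline, a targeted attack (for example targeted APGD, one of AutoAttack's components) would then be run against the top-ranked candidates in order; ranking by estimated boundary distance rather than raw confidence is what would let such an attack reach samples that naive targeting misses.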

