Samples on Thin Ice: Re-Evaluating Adversarial Pruning of Neural Networks (2310.08073v1)
Abstract: Neural network pruning has been shown to be an effective technique for reducing network size, trading desirable properties like generalization and robustness to adversarial attacks for higher sparsity. Recent work has claimed that adversarial pruning methods can produce sparse networks while also preserving robustness to adversarial examples. In this work, we first re-evaluate three state-of-the-art adversarial pruning methods, showing that their robustness was indeed overestimated. We then compare pruned and dense versions of the same models, discovering that samples on thin ice, i.e., those closer to the unpruned model's decision boundary, are typically misclassified after pruning. We conclude by discussing how this intuition may lead to designing more effective adversarial pruning methods in future work.
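The "thin ice" intuition lends itself to a simple empirical check: rank test samples by how close they sit to the dense model's decision boundary, then see whether the closest ones are the ones that flip to the wrong class after pruning. The sketch below uses the logit margin as a cheap proxy for boundary distance (a more faithful measure would use minimum-norm adversarial perturbations); `dense_model`, `pruned_model`, and `test_loader` are hypothetical handles, assumed to hold a trained network, its pruned counterpart, and the evaluation data.

```python
# Minimal sketch, not the paper's exact protocol: compare the dense model's
# logit margins on samples the pruned model preserves vs. those it flips.
import torch

@torch.no_grad()
def margin_and_flips(dense_model, pruned_model, loader, device="cpu"):
    dense_model.eval().to(device)
    pruned_model.eval().to(device)
    margins, flipped = [], []
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        logits_d = dense_model(x)
        logits_p = pruned_model(x)
        # Logit margin: true-class logit minus the best competing logit.
        # Small margins are a cheap proxy for proximity to the boundary.
        true = logits_d.gather(1, y.unsqueeze(1)).squeeze(1)
        other = logits_d.scatter(1, y.unsqueeze(1), float("-inf")).amax(dim=1)
        margins.append(true - other)
        # Samples the dense model classifies correctly but the pruned one misses.
        flipped.append((logits_d.argmax(1) == y) & (logits_p.argmax(1) != y))
    return torch.cat(margins), torch.cat(flipped)

# Hypothetical usage; dense_model, pruned_model, test_loader assumed to exist.
margins, flipped = margin_and_flips(dense_model, pruned_model, test_loader)
print(f"mean margin, flipped samples:   {margins[flipped].mean():.3f}")
print(f"mean margin, preserved samples: {margins[~flipped].mean():.3f}")
```

If the paper's observation holds, the flipped samples should show a markedly smaller mean margin, i.e., they were already on thin ice before pruning.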
Authors: Giorgio Piras, Maura Pintor, Ambra Demontis, Battista Biggio