Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency (2403.09441v1)
Abstract: As deep learning (DL) models are increasingly integrated into our everyday lives, ensuring their safety by making them robust against adversarial attacks has become critical. DL models are susceptible to adversarial attacks, in which small, targeted perturbations of the input data cause the model to fail. Adversarial training is an established mitigation strategy that yields more robust models, but this robustness comes at the additional computational cost of generating adversarial attacks during training. The two objectives -- adversarial robustness and computational efficiency -- therefore appear to be in conflict with each other. In this work, we explore the effects of two model compression methods -- structured weight pruning and quantization -- on adversarial robustness. We specifically examine the effect of fine-tuning compressed models, and present the trade-off between standard fine-tuning and adversarial fine-tuning. Our results show that compression does not inherently degrade model robustness, and that adversarial fine-tuning of a compressed model can yield large improvements in robustness. Experiments on two benchmark datasets show that adversarial fine-tuning of compressed models can achieve robustness comparable to adversarially trained models, while also improving computational efficiency.
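The pipeline the abstract describes -- compress first, then fine-tune on adversarially perturbed inputs -- can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact setup: the toy model, the PGD hyperparameters, and the synthetic batch are all placeholder assumptions; only structured pruning and PGD-based adversarial fine-tuning correspond to techniques named in the abstract.

```python
# Hedged sketch: adversarial fine-tuning of a structurally pruned model.
# Model, data, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Toy classifier standing in for a compressed network.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 8 * 8, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)

# Structured weight pruning (L2 norm over output neurons), one of the two
# compression methods the paper studies (the other being quantization).
prune.ln_structured(model[1], name="weight", amount=0.5, n=2, dim=0)

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=3):
    """Projected gradient descent attack inside an L-inf ball of radius eps."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project into ball
        x_adv = x_adv.clamp(0, 1)                          # keep valid pixels
    return x_adv.detach()

# Adversarial fine-tuning: train the compressed model on attacked inputs.
opt = torch.optim.SGD(model.parameters(), lr=0.05)
x = torch.rand(16, 3, 8, 8)           # placeholder batch (CIFAR-like shape)
y = torch.randint(0, 10, (16,))
for _ in range(5):
    x_adv = pgd_attack(model, x, y)
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    opt.step()
print(float(loss))  # adversarial loss after a few fine-tuning steps
```

Because the attack is regenerated inside every fine-tuning step, the loop is more expensive than standard fine-tuning; the paper's point is that paying this cost only during a short fine-tuning phase on an already-compressed model recovers much of the robustness of full adversarial training.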
- Hallgrimur Thorsteinsson
- Valdemar J Henriksen
- Tong Chen
- Raghavendra Selvan