Ensemble Adversarial Defense via Integration of Multiple Dispersed Low Curvature Models (2403.16405v1)
Abstract: Integrating an ensemble of deep learning models has been extensively explored as a defense against adversarial attacks. Diversity among sub-models increases the cost of an attack that must deceive the majority of the ensemble, thereby improving adversarial robustness. Existing approaches mainly focus on increasing diversity in feature representations or in the dispersion of first-order gradients with respect to the input, but the limited correlation between these diversity metrics and adversarial robustness constrains the performance of ensemble adversarial defense. In this work, we aim to enhance ensemble diversity by reducing attack transferability. We identify second-order gradients, which characterize the loss curvature, as a key factor in adversarial robustness. Because computing the full Hessian matrix is computationally expensive, we approximate the Hessian-vector product with a differential (finite-difference) approximation. Since low curvature yields better robustness, our ensemble is designed to account for the curvature of each sub-model. We introduce a novel regularizer to train multiple diverse, low-curvature network models. Extensive experiments across various datasets demonstrate that our ensemble exhibits superior robustness against a range of attacks, underscoring the effectiveness of our approach.
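The central computational trick is avoiding the explicit Hessian by estimating the Hessian-vector product with a finite difference of first-order gradients, Hv ≈ (∇L(x + h·v) − ∇L(x)) / h. Below is a minimal PyTorch sketch of this idea, not the paper's exact formulation: the function names `curvature_penalty` and `ensemble_loss`, the step size `h`, the weight `lam`, and the choice of the normalized sign-of-gradient direction for v (as in curvature-regularization work such as CURE) are all illustrative assumptions.

```python
# Sketch (illustrative, not the paper's exact method) of a low-curvature
# penalty built from a finite-difference Hessian-vector product:
#   H v  ~=  (grad L(x + h*v) - grad L(x)) / h
import torch
import torch.nn.functional as F

def curvature_penalty(model, x, y, h=1.0):
    """Penalize ||H v||^2, with v the normalized sign of the input
    gradient, using only two first-order gradient evaluations."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]

    # Finite-difference direction: normalized sign of the input gradient,
    # treated as a constant (detached) when perturbing x.
    v = torch.sign(grad).detach()
    v = v / (v.flatten(1).norm(dim=1).view(-1, *([1] * (v.dim() - 1))) + 1e-12)

    # Gradient at the perturbed point x + h*v.
    loss_pert = F.cross_entropy(model(x + h * v), y)
    grad_pert = torch.autograd.grad(loss_pert, x, create_graph=True)[0]

    # Squared norm of the approximate Hessian-vector product; minimizing
    # it flattens the loss surface around x (low curvature).
    hv = (grad_pert - grad) / h
    return hv.flatten(1).pow(2).sum(dim=1).mean()

def ensemble_loss(models, x, y, lam=1.0):
    """Cross-entropy on averaged logits plus the mean per-sub-model
    curvature penalty (a stand-in for the paper's regularizer)."""
    logits = torch.stack([m(x) for m in models]).mean(dim=0)
    ce = F.cross_entropy(logits, y)
    curv = sum(curvature_penalty(m, x, y) for m in models) / len(models)
    return ce + lam * curv
```

Note that `ensemble_loss` only applies the penalty per sub-model; the paper's regularizer additionally disperses curvature among the sub-models to reduce attack transferability, which this sketch does not capture.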
Authors: Kaikang Zhao, Xi Chen, Wei Huang, Liuxin Ding, Xianglong Kong, Fan Zhang