
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off (2402.14648v2)

Published 22 Feb 2024 in cs.LG and cs.AI

Abstract: Although adversarial training has been the state-of-the-art approach to defend against adversarial examples (AEs), it suffers from a robustness-accuracy trade-off, where high robustness is achieved at the cost of clean accuracy. In this work, we leverage invariance regularization on latent representations to learn discriminative yet adversarially invariant representations, aiming to mitigate this trade-off. We analyze two key issues in representation learning with invariance regularization: (1) a "gradient conflict" between invariance loss and classification objectives, leading to suboptimal convergence, and (2) the mixture distribution problem arising from diverged distributions of clean and adversarial inputs. To address these issues, we propose Asymmetrically Representation-regularized Adversarial Training (AR-AT), which incorporates asymmetric invariance loss with stop-gradient operation and a predictor to improve the convergence, and a split-BatchNorm (BN) structure to resolve the mixture distribution problem. Our method significantly improves the robustness-accuracy trade-off by learning adversarially invariant representations without sacrificing discriminative ability. Furthermore, we discuss the relevance of our findings to knowledge-distillation-based defense methods, contributing to a deeper understanding of their relative successes.
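The two components described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the module and function names (`SplitBNBlock`, `ar_at_loss`), the toy linear encoder, and the cosine-distance form of the invariance loss are illustrative assumptions; only the asymmetric stop-gradient/predictor structure and the dual BatchNorm layers follow the abstract's description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SplitBNBlock(nn.Module):
    """Toy encoder block with a split-BN structure: separate BatchNorm
    layers for clean and adversarial inputs, so their diverged statistics
    do not mix (hypothetical stand-in for the paper's encoder)."""

    def __init__(self, dim: int):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.bn_clean = nn.BatchNorm1d(dim)  # tracks clean-input statistics
        self.bn_adv = nn.BatchNorm1d(dim)    # tracks adversarial-input statistics

    def forward(self, x: torch.Tensor, adv: bool = False) -> torch.Tensor:
        bn = self.bn_adv if adv else self.bn_clean
        return F.relu(bn(self.fc(x)))


def ar_at_loss(encoder, predictor, classifier, x_clean, x_adv, y):
    """Asymmetric invariance loss (sketch): the adversarial branch passes
    through a predictor head, while the clean branch is detached
    (stop-gradient), so gradients from the invariance term do not pull
    the clean representation toward the adversarial one."""
    z_clean = encoder(x_clean, adv=False)
    z_adv = encoder(x_adv, adv=True)
    p_adv = predictor(z_adv)
    # Cosine-distance invariance term against the stop-gradient clean target.
    inv = 1.0 - F.cosine_similarity(p_adv, z_clean.detach(), dim=1).mean()
    # Standard classification objective on the adversarial branch.
    ce = F.cross_entropy(classifier(z_adv), y)
    return ce + inv
```

A training step would compute `x_adv` with an attack such as PGD, then backpropagate `ar_at_loss`; the `detach()` on the clean target is what makes the regularization asymmetric, mirroring the stop-gradient trick from non-contrastive self-supervised methods like SimSiam.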
