Towards Understanding Dual BN In Hybrid Adversarial Training (2403.19150v1)

Published 28 Mar 2024 in cs.LG, cs.AI, cs.CR, and cs.CV

Abstract: There is a growing concern about applying batch normalization (BN) in adversarial training (AT), especially when the model is trained on both adversarial samples and clean samples (termed Hybrid-AT). With the assumption that adversarial and clean samples come from two different domains, a common practice in prior works is to adopt Dual BN, where two separate branches, BN_adv and BN_clean, are used for adversarial and clean samples, respectively. A popular belief motivating Dual BN is that estimating the normalization statistics of this mixture distribution is challenging, and thus disentangling it for normalization achieves stronger robustness. In contrast to this belief, we reveal that disentangling statistics plays a smaller role than disentangling affine parameters in model training. This finding aligns with prior work (Rebuffi et al., 2023), and we build upon their research for further investigation. We demonstrate that the domain gap between adversarial and clean samples is not very large, which is counter-intuitive considering the significant influence of adversarial perturbation on model accuracy. We further propose a two-task hypothesis that serves as the empirical foundation and a unified framework for Hybrid-AT improvement. We also investigate Dual BN at test time and reveal that the affine parameters characterize robustness during inference. Overall, our work sheds new light on the mechanism of Dual BN in Hybrid-AT and its underlying justification.
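The Dual BN mechanism discussed in the abstract can be sketched in a few lines: each branch keeps its own normalization statistics and its own affine parameters (gamma, beta), and a batch is routed to one branch depending on whether it is clean or adversarial. The following is a minimal 1-D pure-Python illustration of that idea, not the paper's implementation; the class and attribute names are assumptions for exposition.

```python
import math

class DualBN:
    """Illustrative Dual BN sketch (not the paper's code): separate
    statistics and affine parameters for the clean and adversarial
    branches, selected per batch."""

    def __init__(self, eps=1e-5):
        # Per-branch running statistics (momentum update omitted for brevity)
        self.stats = {"clean": {"mean": 0.0, "var": 1.0},
                      "adv":   {"mean": 0.0, "var": 1.0}}
        # Per-branch learnable affine parameters; the paper's finding is
        # that disentangling these matters more than disentangling stats.
        self.affine = {"clean": {"gamma": 1.0, "beta": 0.0},
                       "adv":   {"gamma": 1.0, "beta": 0.0}}
        self.eps = eps

    def __call__(self, batch, branch):
        # Estimate batch statistics for the selected branch only
        n = len(batch)
        mean = sum(batch) / n
        var = sum((x - mean) ** 2 for x in batch) / n
        self.stats[branch]["mean"] = mean
        self.stats[branch]["var"] = var
        a = self.affine[branch]
        # Normalize with branch statistics, then apply branch affine params
        return [a["gamma"] * (x - mean) / math.sqrt(var + self.eps) + a["beta"]
                for x in batch]

bn = DualBN()
clean_out = bn([1.0, 2.0, 3.0], branch="clean")  # routed to BN_clean
adv_out = bn([1.5, 2.5, 3.5], branch="adv")      # routed to BN_adv
```

During Hybrid-AT, the clean and adversarial losses would each backpropagate through their own branch's affine parameters, which is the component the paper identifies as doing most of the work.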

References (57)
  1. Understanding and improving fast adversarial training. NeurIPS, 2020.
  2. Towards an adversarially robust normalization approach. arXiv preprint arXiv:2006.11007, 2020.
  3. Recent advances in adversarial training for adversarial robustness. arXiv preprint arXiv:2102.01356, 2021.
  4. Revisiting batch normalization for improving corruption robustness. WACV, 2021a.
  5. Batch normalization increases adversarial vulnerability and decreases adversarial transferability: A non-robust feature perspective. In ICCV, 2021b.
  6. Understanding batch normalization. Advances in neural information processing systems, 31, 2018.
  7. Unlabeled data improves adversarial robustness. NeurIPS, 2019.
  8. Domain-specific batch normalization for unsupervised domain adaptation. In CVPR, 2019.
  9. Adversarial masking: Towards understanding robustness trade-off for generalization. 2020.
  10. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In ICML, 2020.
  11. Hal Daumé III. Frustratingly easy domain adaptation. In Association of Computational Linguistics (ACL), 2007.
  12. Make some noise: Reliable and efficient single-step adversarial training. arXiv preprint arXiv:2202.01181, 2022.
  13. Random normalization aggregation for adversarial defense. Advances in Neural Information Processing Systems, 35:33676–33688, 2022.
  14. Boosting adversarial attacks with momentum. In CVPR, 2018.
  15. When does contrastive learning preserve adversarial robustness from pretraining to finetuning? NeurIPS, 2021.
  16. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 2016.
  17. Sandwich batch normalization: A drop-in replacement for feature distribution heterogeneity. In WACV, 2022.
  18. Explaining and harnessing adversarial examples. In ICLR, 2015.
  19. Uncovering the limits of adversarial training against norm-bounded adversarial examples. arXiv preprint arXiv:2010.03593, 2020.
  20. Deep residual learning for image recognition. In CVPR, 2016.
  21. Densely connected convolutional networks. In CVPR, 2017.
  22. Reciprocal normalization for domain adaptation. Pattern Recognition, 140:109533, 2023.
  23. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
  24. Prior-guided adversarial initialization for fast adversarial training. In ECCV, 2022a.
  25. Boosting fast adversarial training with learnable adversarial initialization. IEEE Transactions on Image Processing, 2022b.
  26. Robust pre-training by adversarial contrastive learning. NeurIPS, 2020.
  27. Adversarial logit pairing. arXiv preprint arXiv:1803.06373, 2018.
  28. Alex Krizhevsky et al. Learning multiple layers of features from tiny images. Technical report, 2009.
  29. Demystifying resnet. arXiv preprint arXiv:1611.01186, 2016.
  30. Revisiting batch normalization for practical domain adaptation. ICLR workshop, 2017.
  31. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
  32. Adversarially trained models with test-time covariate shift adaptation. arXiv preprint arXiv:2102.05096, 2021.
  33. Bag of tricks for adversarial training. arXiv preprint arXiv:2010.00467, 2020.
  34. Reliably fast adversarial training via latent adversarial perturbation. In ICCV, 2021.
  35. Review of artificial intelligence adversarial attack and defense technologies. Applied Sciences, 2019.
  36. Revisiting adapters with adversarial training. In ICLR, 2023.
  37. How does batch normalization help optimization? In NeurIPS, 2018.
  38. Improving robustness against common corruptions by covariate shift adaptation. NeurIPS, 2020.
  39. Adversarial training for free! In NeurIPS, 2019.
  40. Improving the accuracy-robustness trade-off for dual-domain adversarial training.
  41. Correlation alignment for unsupervised domain adaptation. In Domain Adaptation in Computer Vision Applications. 2017.
  42. Resnet in resnet: Generalizing residual architectures. arXiv preprint arXiv:1603.08029, 2016.
  43. Are labels required for improving adversarial robustness? NeurIPS, 2019.
  44. Once-for-all adversarial training: In-situ tradeoff between robustness and accuracy for free. arXiv preprint arXiv:2010.11828, 2020.
  45. Augmax: Adversarial composition of random augmentations for robust training. NeurIPS, 2021.
  46. Fast is better than free: Revisiting adversarial training. ICLR, 2020.
  47. Wider or deeper: Revisiting the resnet model for visual recognition. Pattern Recognition, 2019.
  48. Intriguing properties of adversarial training at scale. ICLR, 2020.
  49. Adversarial examples improve image recognition. In CVPR, 2020a.
  50. Smooth adversarial training. arXiv preprint arXiv:2006.14536, 2020b.
  51. Adversarial momentum-contrastive pre-training. arXiv preprint arXiv:2012.13154, 2020.
  52. Revisiting residual networks with nonlinear shortcuts. In BMVC, 2019a.
  53. Resnet or densenet? introducing dense shortcuts to resnet. In WACV, 2021a.
  54. A survey on universal adversarial attack. IJCAI, 2021b.
  55. Decoupled adversarial contrastive learning for self-supervised adversarial robustness. In ECCV, 2022.
  56. Theoretically principled trade-off between robustness and accuracy. In ICML, 2019b.
  57. Where is the bottleneck of adversarial learning with unlabeled data? arXiv preprint arXiv:1911.08696, 2019c.
