Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization (2310.12713v2)

Published 19 Oct 2023 in cs.LG

Abstract: Adversarial Training (AT), pivotal in fortifying the robustness of deep learning models, is extensively adopted in practical applications. However, prevailing AT methods, which rely on direct iterative updates for the target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting. In this context, our work illuminates the potential of leveraging the target model's historical states as a proxy that provides effective initialization and a defense prior, resulting in a general proxy-guided defense framework, LAST (Learn from the Past). Specifically, LAST derives the proxy model's response as dynamically learned fast weights, which continuously correct the update direction of the target model. In addition, we introduce a self-distillation regularized defense objective, designed to steer the proxy model's update trajectory without resorting to external teacher models, thereby ameliorating the impact of catastrophic overfitting on performance. Extensive experiments and ablation studies showcase the framework's efficacy in markedly improving model robustness (e.g., up to 9.2% and 20.3% gains in robust accuracy on the CIFAR10 and CIFAR100 datasets, respectively) and training stability. These improvements are consistently observed across various model architectures, larger datasets, perturbation sizes, and attack modalities, affirming LAST's ability to refine both single-step and multi-step AT strategies. The code will be available at https://github.com/callous-youth/LAST.
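
For concreteness, here is a minimal PyTorch-style sketch of one proxy-guided training step as the abstract describes it: the proxy (a running summary of the target model's historical weights) supplies a self-distillation term in the defense objective and then absorbs the target's new state. This is an interpretation of the abstract only; the helper `pgd_attack`, the EMA proxy update (standing in for the paper's dynamically learned fast weights), and the hyperparameters `gamma`, `lam`, and `tau` are illustrative assumptions, not the authors' actual implementation, which lives in the linked repository.

```python
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Generic PGD inner maximization (not LAST-specific)."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()


def last_step(target, proxy, optimizer, x, y, gamma=0.9, lam=1.0, tau=2.0):
    """One LAST-style step; gamma, lam, tau are illustrative, not the paper's."""
    x_adv = pgd_attack(target, x, y)

    logits = target(x_adv)
    with torch.no_grad():
        proxy_logits = proxy(x_adv)

    # Self-distillation regularizer: pull the target's predictions toward the
    # proxy's softened predictions -- no external teacher model involved.
    kd = F.kl_div(
        F.log_softmax(logits / tau, dim=1),
        F.softmax(proxy_logits / tau, dim=1),
        reduction="batchmean",
    )
    loss = F.cross_entropy(logits, y) + lam * tau ** 2 * kd

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Let the proxy absorb the target's new state. A simple EMA stands in
    # here for the paper's dynamically learned fast weights.
    with torch.no_grad():
        for p_tgt, p_pxy in zip(target.parameters(), proxy.parameters()):
            p_pxy.mul_(gamma).add_(p_tgt, alpha=1.0 - gamma)

    return loss.item()
```

In use, the proxy would be initialized from the target (e.g. `proxy = copy.deepcopy(target)`) and never touched by the optimizer; only the EMA line above moves it, so it always encodes the target's recent history.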

