
AFD: Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement (2401.14707v2)

Published 26 Jan 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Adversarial fine-tuning methods enhance adversarial robustness via fine-tuning the pre-trained model in an adversarial training manner. However, we identify that some specific latent features of adversarial samples are confused by adversarial perturbation and lead to an unexpectedly increasing gap between features in the last hidden layer of natural and adversarial samples. To address this issue, we propose a disentanglement-based approach to explicitly model and further remove the specific latent features. We introduce a feature disentangler to separate out the specific latent features from the features of the adversarial samples, thereby boosting robustness by eliminating the specific latent features. Besides, we align clean features in the pre-trained model with features of adversarial samples in the fine-tuned model, to benefit from the intrinsic features of natural samples. Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
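The disentangle-and-align idea described in the abstract can be sketched as follows. This is an illustrative toy example, not the authors' implementation: the linear disentangler, the feature dimensions, and the MSE alignment loss are assumptions chosen for clarity (the paper's disentangler is a learned module trained jointly with the fine-tuned model).

```python
import numpy as np

rng = np.random.default_rng(0)

def disentangle(f_adv, W):
    """Toy linear disentangler: splits adversarial features into a
    perturbation-specific component and a remaining robust component.
    (The paper uses a learned network; a single linear map stands in here.)"""
    specific = f_adv @ W        # perturbation-specific latent features
    robust = f_adv - specific   # features kept after removing them
    return robust, specific

def alignment_loss(robust, f_nat):
    """MSE alignment between disentangled adversarial features (from the
    fine-tuned model) and clean features from the frozen pre-trained model."""
    return float(np.mean((robust - f_nat) ** 2))

d = 8
f_nat = rng.normal(size=(4, d))                # clean features, pre-trained model
f_adv = f_nat + 0.1 * rng.normal(size=(4, d))  # perturbed features, toy stand-in
W = 0.05 * rng.normal(size=(d, d))             # disentangler weights (toy init)

robust, specific = disentangle(f_adv, W)
loss = alignment_loss(robust, f_nat)
print(loss)
```

In training, the alignment loss would be minimized alongside the adversarial fine-tuning objective so that, after removing the perturbation-specific component, adversarial features match the intrinsic features of natural samples.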

