
Invariance-powered Trustworthy Defense via Remove Then Restore (2402.00304v1)

Published 1 Feb 2024 in cs.CV

Abstract: Adversarial attacks pose a challenge to the deployment of deep neural networks (DNNs), yet previous defense models overlook generalization to various attacks. Inspired by targeted therapies for cancer, we view adversarial samples as local lesions of natural benign samples: a key finding is that the salient attack in an adversarial sample dominates the attacking process, while the trivial attack unexpectedly provides trustworthy evidence for obtaining generalizable robustness. Based on this finding, a Pixel Surgery and Semantic Regeneration (PSSR) model following the targeted therapy mechanism is developed, which has three merits: 1) to remove the salient attack, a score-based Pixel Surgery module is proposed, which retains the trivial attack as a kind of invariance information; 2) to restore the discriminative content, a Semantic Regeneration module based on a conditional alignment extrapolator is proposed, which achieves pixel and semantic consistency; 3) to further harmonize robustness and accuracy, an intractable problem, a self-augmentation regularizer with adversarial R-drop is designed. Experiments on numerous benchmarks show the superiority of PSSR.
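
The abstract gives no implementation details, so the following is a minimal sketch, assuming a PyTorch setup, of how the remove-then-restore idea and the adversarial R-drop consistency term could look. The saliency score (input-gradient magnitude), the top-fraction cutoff, the restorer interface, and the loss weights are all assumptions for illustration, not the authors' method.

# Hedged sketch (not the paper's code): remove salient attack pixels, restore
# content, and apply an R-drop-style consistency regularizer. Scoring rule,
# masking threshold, restorer architecture, and loss weight are assumptions.
import torch
import torch.nn.functional as F

def pixel_surgery(x_adv, classifier, top_fraction=0.05):
    """Excise the most 'salient' attacked pixels, keeping the trivial ones.

    Saliency is approximated here by the input-gradient magnitude of the
    classifier's top-class confidence; the paper's actual score may differ.
    """
    x = x_adv.clone().requires_grad_(True)
    conf = classifier(x).max(dim=1).values.sum()
    grad, = torch.autograd.grad(conf, x)
    score = grad.abs().sum(dim=1, keepdim=True)              # per-pixel score (B,1,H,W)
    k = max(1, int(top_fraction * score[0].numel()))
    thresh = score.flatten(1).topk(k, dim=1).values[:, -1]   # per-image k-th largest score
    mask = (score >= thresh.view(-1, 1, 1, 1)).float()       # 1 = salient attack region
    return x_adv * (1.0 - mask), mask                        # remove salient, keep trivial

def restore_and_regularize(x_removed, mask, restorer, classifier, labels):
    """Regenerate content, then add an R-drop-style term: two dropout-active
    forward passes are pulled together via symmetric KL, plus cross-entropy."""
    x_restored = restorer(torch.cat([x_removed, mask], dim=1))
    logits1 = classifier(x_restored)   # classifier kept in train() mode so that
    logits2 = classifier(x_restored)   # dropout differs between the two passes
    ce = F.cross_entropy(logits1, labels)
    p1, p2 = F.log_softmax(logits1, 1), F.log_softmax(logits2, 1)
    kl = 0.5 * (F.kl_div(p1, p2, reduction="batchmean", log_target=True)
                + F.kl_div(p2, p1, reduction="batchmean", log_target=True))
    return ce + 1.0 * kl               # trade-off weight is a placeholder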

