Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training (2106.05453v2)

Published 10 Jun 2021 in cs.CV

Abstract: Deep neural networks (DNNs) are vulnerable to adversarial noise. A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise, among which input pre-processing methods are scalable and show great potential to safeguard DNNs. However, pre-processing methods may suffer from the robustness degradation effect, in which the defense reduces rather than improves the adversarial robustness of a target model in a white-box setting. A potential cause of this negative effect is that the adversarial training examples are static and independent of the pre-processing model. To solve this problem, we investigate the influence of full adversarial examples, which are crafted against the full model, and find that they indeed have a positive impact on the robustness of defenses. Furthermore, we find that simply changing the adversarial training examples in pre-processing methods does not completely alleviate the robustness degradation effect. This is because the adversarial risk of the pre-processed model is neglected, which is another cause of the robustness degradation effect. Motivated by the above analyses, we propose a method called Joint Adversarial Training based Pre-processing (JATP) defense. Specifically, we formulate a feature-similarity-based adversarial risk for the pre-processing model by using full adversarial examples found in a feature space. Unlike standard adversarial training, we only update the pre-processing model, which prompts us to introduce a pixel-wise loss to improve its cross-model transferability. We then conduct joint adversarial training on the pre-processing model to minimize this overall risk. Empirical results show that our method effectively mitigates the robustness degradation effect across different target models in comparison to previous state-of-the-art approaches.
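To make the abstract's training recipe concrete, below is a minimal, hedged sketch of one JATP-style update step: craft a full adversarial example against the pre-processing model composed with a frozen target model, then update only the pre-processing model to minimize a feature-similarity adversarial risk plus a pixel-wise loss. The module names (`preprocessor`, `target_model`, `target_model.features`), the use of a PGD/cross-entropy attack, the MSE losses, and the weighting `lam` are illustrative assumptions, not the paper's exact formulation (the paper finds full adversarial examples in a feature space and defines its own risk terms).

```python
import torch
import torch.nn.functional as F

def full_adversarial_example(x, y, preprocessor, target_model,
                             eps=8/255, alpha=2/255, steps=10):
    """PGD-style attack crafted against the full model (preprocessor + classifier).
    Cross-entropy is used here as a stand-in for the paper's feature-space objective."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(target_model(preprocessor(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def jatp_step(x, y, preprocessor, target_model, optimizer, lam=1.0):
    """One joint adversarial training step; only the pre-processing model is updated."""
    x_adv = full_adversarial_example(x, y, preprocessor, target_model)
    feat_nat = target_model.features(x).detach()             # clean reference features
    feat_adv = target_model.features(preprocessor(x_adv))    # features after pre-processing the attack
    feature_risk = F.mse_loss(feat_adv, feat_nat)            # feature-similarity adversarial risk
    pixel_loss = F.mse_loss(preprocessor(x_adv), x)          # pixel-wise loss for cross-model transfer
    loss = feature_risk + lam * pixel_loss
    optimizer.zero_grad()
    loss.backward()                                          # gradients flow only into the preprocessor
    optimizer.step()
    return loss.item()
```

In this sketch the target model's parameters are never placed in the optimizer, so joint adversarial training adapts the pre-processing defense to attacks on the full pipeline while leaving the classifier untouched, which is the property that allows one trained defense to be transferred across different target models.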
