Adversarial Purification with the Manifold Hypothesis
Abstract: In this work, we formulate a novel framework for adversarial robustness using the manifold hypothesis. This framework provides sufficient conditions for defending against adversarial examples. We develop an adversarial purification method with this framework. Our method combines manifold learning with variational inference to provide adversarial robustness without the need for expensive adversarial training. Experimentally, our approach can provide adversarial robustness even if attackers are aware of the existence of the defense. In addition, our method can also serve as a test-time defense mechanism for variational autoencoders.
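As a rough illustration of what test-time purification toward a learned manifold can look like, the toy sketch below may help. It is not the method of this paper, whose objective the abstract does not specify. It assumes a linear manifold (a PCA subspace), uses squared reconstruction error as a stand-in for a variational objective, and limits the purifying perturbation to an L-infinity budget `eps`. All function names and parameters are hypothetical.

```python
import numpy as np

def fit_linear_manifold(X, k):
    """Return the data mean and an orthonormal basis (d x k) of the top-k PCA subspace."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k].T

def recon_error(x, mu, basis):
    """Squared distance from x to the affine subspace mu + span(basis)."""
    c = x - mu
    r = c - basis @ (basis.T @ c)
    return float(r @ r)

def purify(x, mu, basis, eps=0.1, lr=0.1, steps=50):
    """Projected gradient descent on the reconstruction error.

    Assumption: the purifying perturbation delta satisfies ||delta||_inf <= eps.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        c = x + delta - mu
        r = c - basis @ (basis.T @ c)      # residual orthogonal to the manifold
        delta -= lr * 2.0 * r              # gradient of ||r||^2 w.r.t. delta is 2 r
        delta = np.clip(delta, -eps, eps)  # project back into the eps-ball
    return x + delta
```

In this toy setting a downstream classifier would be applied to the output of `purify` rather than to the raw input. No classifier is retrained, which is the sense in which purification avoids adversarial training.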