Defense without Forgetting: Continual Adversarial Defense with Anisotropic & Isotropic Pseudo Replay (2404.01828v1)
Abstract: Deep neural networks have demonstrated susceptibility to adversarial attacks. Adversarial defense techniques often focus on the one-shot setting, maintaining robustness against a single attack. However, in real-world deployment scenarios new attacks can emerge in sequence. It is therefore crucial for a defense model to adapt continually to new attacks, but this adaptation can lead to catastrophic forgetting of the defenses against previously encountered attacks. In this paper, we discuss for the first time the concept of continual adversarial defense under a sequence of attacks, and propose a lifelong defense baseline called Anisotropic & Isotropic Replay (AIR), which offers three advantages: (1) Isotropic replay ensures model consistency in the neighborhood distribution of new data, indirectly aligning the output preference between old and new tasks. (2) Anisotropic replay enables the model to learn a compromise data manifold with fresh mixed semantics, which supports further replay constraints and potential future attacks. (3) A straightforward regularizer mitigates the 'plasticity-stability' trade-off by aligning model outputs between new and old tasks. Experimental results demonstrate that AIR can approximate or even exceed the empirical performance upper bound achieved by Joint Training.
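To make the three components concrete, below is a minimal PyTorch-style sketch of one AIR-like training step. It is an illustration under stated assumptions, not the paper's exact formulation: the isotropic perturbation is taken to be Gaussian noise, the anisotropic replay is taken to be mixup-style interpolation, the stability regularizer is a KL alignment against a frozen copy of the previously defended model, and the names and weights (`model`, `old_model`, `sigma`, `alpha`, `lam_iso`, `lam_aniso`, `lam_reg`) are hypothetical.

```python
import torch
import torch.nn.functional as F

def air_style_loss(model, old_model, x_adv, y,
                   sigma=0.1, alpha=0.4,
                   lam_iso=1.0, lam_aniso=1.0, lam_reg=1.0):
    """Illustrative AIR-like objective on a batch of adversarial examples
    from the current (new) attack. Perturbation forms and weights are
    assumptions for illustration, not the paper's specification."""
    # Robust classification loss on the new attack's examples.
    logits_new = model(x_adv)
    loss_ce = F.cross_entropy(logits_new, y)

    # (1) Isotropic pseudo replay (assumed Gaussian): perturb new data
    # isotropically and keep the model consistent on the neighborhood.
    x_iso = x_adv + sigma * torch.randn_like(x_adv)
    loss_iso = F.kl_div(F.log_softmax(model(x_iso), dim=1),
                        F.softmax(logits_new.detach(), dim=1),
                        reduction="batchmean")

    # (2) Anisotropic pseudo replay (assumed mixup-style): interpolate pairs
    # of samples to build a compromise manifold with mixed semantics.
    perm = torch.randperm(x_adv.size(0), device=x_adv.device)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x_adv + (1.0 - lam) * x_adv[perm]
    logits_mix = model(x_mix)
    loss_aniso = (lam * F.cross_entropy(logits_mix, y)
                  + (1.0 - lam) * F.cross_entropy(logits_mix, y[perm]))

    # (3) Plasticity-stability regularizer: align the adapting model's outputs
    # with a frozen copy trained on the previous attacks.
    with torch.no_grad():
        logits_old = old_model(x_adv)
    loss_reg = F.kl_div(F.log_softmax(logits_new, dim=1),
                        F.softmax(logits_old, dim=1),
                        reduction="batchmean")

    return loss_ce + lam_iso * loss_iso + lam_aniso * loss_aniso + lam_reg * loss_reg
```

In such a setup, `old_model` would be a frozen snapshot of the defense taken before adapting to the newly arrived attack, so the last term penalizes drift away from previously learned robust behavior while the first three terms drive adaptation.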