Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-cycle Perspective (2302.09457v2)
Abstract: Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, whereby models may make predictions that are inconsistent with human expectations. Several paradigms have recently been developed to explore this adversarial phenomenon at different stages of a machine learning system's life-cycle: backdoor attacks occurring at the pre-training, in-training, and inference stages; weight attacks occurring at the post-training, deployment, and inference stages; and adversarial attacks occurring at the inference stage. Although these adversarial paradigms share a common goal, their developments have proceeded almost independently, and there is still no big picture of AML. In this work, we aim to provide a unified perspective for the AML community to systematically review the overall progress of this field. We first give a general definition of AML, and then propose a unified mathematical framework covering the existing attack paradigms. Based on this framework, we build a full taxonomy to systematically categorize and review representative methods for each paradigm. Moreover, the unified framework makes it easy to identify the connections and differences among the different attack paradigms, which may inspire future researchers to develop more advanced ones. Finally, to facilitate browsing of the taxonomy and the related literature in adversarial machine learning, we provide a website, i.e., http://adversarial-ml.com, where the taxonomies and literature will be continuously updated.