Backdoor Attack with Sparse and Invisible Trigger (2306.06209v3)
Abstract: Deep neural networks (DNNs) are vulnerable to backdoor attacks, where the adversary manipulates a small portion of the training data such that the victim model behaves normally on benign samples but classifies triggered samples as the target class. Backdoor attacks are an emerging yet serious training-phase threat, leading to severe risks in DNN-based applications. In this paper, we revisit the trigger patterns of existing backdoor attacks. We reveal that they are either visible or not sparse, and therefore not stealthy enough. More importantly, simply combining existing methods does not yield an effective sparse and invisible backdoor attack. To address this problem, we formulate trigger generation as a bi-level optimization problem with sparsity and invisibility constraints and propose an effective method to solve it. The proposed method is dubbed sparse and invisible backdoor attack (SIBA). We conduct extensive experiments on benchmark datasets under different settings, which verify the effectiveness of our attack and its resistance to existing backdoor defenses. The code for reproducing the main experiments is available at \url{https://github.com/YinghuaGao/SIBA}.
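As a rough illustration of the joint constraints described in the abstract (not the paper's actual bi-level SIBA procedure), the sketch below optimizes a single universal trigger against an assumed surrogate classifier, projecting onto an L-infinity ball for invisibility and onto a top-k pixel support for sparsity. The surrogate `model`, the budget `k`, `eps`, and the remaining hyperparameters are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F


def sparse_invisible_trigger(model, images, target_class,
                             eps=8 / 255, k=200, steps=100, lr=0.1):
    """Illustrative sketch: optimize a universal trigger `delta` so that an
    assumed surrogate `model` maps `images + delta` to `target_class`, while
    keeping `delta` invisible (L-inf <= eps) and sparse (<= k nonzero pixels)."""
    delta = torch.zeros_like(images[0], requires_grad=True)  # shape (C, H, W)
    targets = torch.full((images.size(0),), target_class,
                         dtype=torch.long, device=images.device)

    for _ in range(steps):
        logits = model(torch.clamp(images + delta, 0.0, 1.0))
        loss = F.cross_entropy(logits, targets)
        (grad,) = torch.autograd.grad(loss, delta)

        with torch.no_grad():
            delta -= lr * grad.sign()                     # push predictions toward the target class
            delta.clamp_(-eps, eps)                       # invisibility: L-inf projection
            per_pixel = delta.abs().sum(dim=0).flatten()  # aggregate magnitude over channels
            mask = torch.zeros_like(per_pixel)
            mask[per_pixel.topk(k).indices] = 1.0         # sparsity: keep the k strongest pixels
            delta *= mask.view(1, *delta.shape[1:])
    return delta.detach()
```

In a poisoning-based attack of the kind the abstract describes, such a trigger would then be added to a small fraction of training images (e.g., `torch.clamp(x + delta, 0, 1)`) whose labels are flipped to the target class.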
- Yinghua Gao
- Yiming Li
- Xueluan Gong
- Zhifeng Li
- Shu-Tao Xia
- Qian Wang