Backdoor Contrastive Learning via Bi-level Trigger Optimization (2404.07863v1)
Abstract: Contrastive Learning (CL) has attracted enormous attention due to its remarkable capability in unsupervised representation learning. However, recent works have revealed the vulnerability of CL to backdoor attacks: the feature extractor can be misled to embed backdoored data close to an attack target class, thus fooling the downstream predictor into misclassifying it as the target. Existing attacks usually adopt a fixed trigger pattern and poison the training set with trigger-injected data, hoping for the feature extractor to learn the association between the trigger and the target class. However, we find that such a fixed trigger design fails to effectively associate trigger-injected data with the target class in the embedding space due to the particular learning mechanisms of CL, leading to a limited attack success rate (ASR). This phenomenon motivates us to find a better backdoor trigger design tailored to the CL framework. In this paper, we propose a bi-level optimization approach to achieve this goal, where the inner optimization simulates the CL dynamics of a surrogate victim, and the outer optimization enforces the backdoor trigger to stay close to the target class throughout the surrogate CL procedure. Extensive experiments show that our attack can achieve a high attack success rate (e.g., $99\%$ ASR on ImageNet-100) with a very low poisoning rate ($1\%$). Moreover, our attack can effectively evade existing state-of-the-art defenses. Code is available at: https://github.com/SWY666/SSL-backdoor-BLTO.
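To make the bi-level structure concrete, below is a minimal PyTorch sketch of the loop the abstract describes: an inner loop that trains a surrogate encoder with a SimCLR-style NT-Xent loss to simulate the victim's CL dynamics, and an outer step that updates the trigger so that trigger-injected embeddings stay close to the target class. Everything here is an illustrative assumption rather than the paper's actual implementation: the loss choice, the `inject` blending scheme, the first-order treatment of the inner loop, the fact that every image in the batch is poisoned (the paper poisons only about 1% of the training set), and all names and hyperparameters are placeholders.

```python
# Hedged sketch of bi-level backdoor trigger optimization for contrastive learning.
# All names (inject, nt_xent, hyperparameters) are illustrative, not the authors' code.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """SimCLR-style NT-Xent contrastive loss between two augmented views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # (2N, d)
    sim = z @ z.t() / tau                                 # pairwise cosine / temperature
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))            # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)                  # positives: the other view

def inject(x, trigger, mask):
    """Blend a bounded trigger patch into images through a fixed spatial mask."""
    return x * (1 - mask) + trigger.clamp(0, 1) * mask

def bilevel_trigger_optimization(encoder, loader, target_imgs, trigger, mask,
                                 inner_steps=5, outer_iters=100,
                                 lr_inner=1e-3, lr_trigger=1e-2):
    trigger = trigger.clone().requires_grad_(True)
    opt_trig = torch.optim.Adam([trigger], lr=lr_trigger)
    opt_enc = torch.optim.SGD(encoder.parameters(), lr=lr_inner)
    data_iter = iter(loader)                              # assumed to yield two views
    for _ in range(outer_iters):
        # Inner loop: simulate the surrogate victim's CL training on poisoned data.
        # Trigger is detached here (a first-order simplification of the bi-level problem).
        for _ in range(inner_steps):
            try:
                (x1, x2), _ = next(data_iter)
            except StopIteration:
                data_iter = iter(loader)
                (x1, x2), _ = next(data_iter)
            loss_cl = nt_xent(encoder(inject(x1, trigger.detach(), mask)),
                              encoder(inject(x2, trigger.detach(), mask)))
            opt_enc.zero_grad(); loss_cl.backward(); opt_enc.step()
        # Outer step: pull trigger-injected embeddings toward the target-class centroid.
        z_trig = F.normalize(encoder(inject(x1, trigger, mask)), dim=1)
        z_tgt = F.normalize(encoder(target_imgs), dim=1).mean(0, keepdim=True)
        loss_out = -(z_trig @ z_tgt.t()).mean()           # maximize cosine similarity
        opt_trig.zero_grad(); loss_out.backward(); opt_trig.step()
    return trigger.detach()
```

The key design point the sketch tries to capture is that the trigger is optimized against the *trajectory* of a surrogate CL run, not a fixed pretrained encoder, so the learned pattern remains attached to the target class as the victim's representation evolves.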
Authors: Weiyu Sun, Xinyu Zhang, Hao Lu, Yingcong Chen, Ting Wang, Jinghui Chen, Lu Lin