
Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking (2312.07955v2)

Published 13 Dec 2023 in cs.CV, cs.AI, cs.CR, and cs.LG

Abstract: Self-Supervised Learning (SSL) is an effective paradigm for learning representations from unlabeled data, such as text, images, and videos. However, researchers have recently found that SSL is vulnerable to backdoor attacks. The attacker can embed hidden SSL backdoors via a few poisoned examples in the training dataset and maliciously manipulate the behavior of downstream models. To defend against SSL backdoor attacks, a feasible route is to detect and remove the poisonous samples in the training set. However, the existing SSL backdoor defense method fails to detect the poisonous samples precisely. In this paper, we propose to erase the SSL backdoor by cluster activation masking and propose a novel PoisonCAM method. After obtaining the threat model trained on the poisoned dataset, our method can precisely detect poisonous samples based on the assumption that masking the backdoor trigger can effectively change the activation of a downstream clustering model. In experiments, our PoisonCAM achieves 96% accuracy for backdoor trigger detection compared to 3% of the state-of-the-art method on poisoned ImageNet-100. Moreover, our proposed PoisonCAM significantly improves the performance of the trained SSL model under backdoor attacks compared to the state-of-the-art method. Our code, data, and trained models will be open once this paper is accepted.
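
The assumption stated in the abstract, that masking the backdoor trigger changes the activation of a downstream clustering model, can be illustrated with a small occlusion test. The sketch below is not the authors' PoisonCAM implementation. It is a toy illustration in which the encoder (`toy_encode`), the two centroids, the 8x8 image, and the 2x2 bright-patch trigger are all hypothetical stand-ins chosen so that the effect is easy to see. It masks one patch at a time and records whether the nearest-centroid assignment flips.

```python
import numpy as np


def assign_cluster(features, centroids):
    """Return the index of the centroid nearest to the feature vector."""
    return int(np.argmin(np.linalg.norm(centroids - features, axis=1)))


def cluster_flip_map(image, encode, centroids, patch=2, fill=0.0):
    """Mask each patch in turn and record whether the cluster assignment changes.

    A True entry marks a patch whose removal flips the cluster, which makes it
    a candidate trigger location under the masking assumption.
    """
    base = assign_cluster(encode(image), centroids)
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    flips = np.zeros((rows, cols), dtype=bool)
    for i in range(rows):
        for j in range(cols):
            masked = image.copy()
            masked[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            flips[i, j] = assign_cluster(encode(masked), centroids) != base
    return flips


def toy_encode(image):
    """Hypothetical stand-in for an SSL encoder: (mean, max) of the pixels."""
    return np.array([image.mean(), image.max()])


# Hypothetical centroids: cluster 0 for clean images, cluster 1 for triggered ones.
centroids = np.array([[0.5, 1.0], [1.0, 10.0]])

clean = np.full((8, 8), 0.5)
poisoned = clean.copy()
poisoned[:2, :2] = 10.0  # toy trigger: a bright 2x2 patch in the top-left corner

poisoned_flips = cluster_flip_map(poisoned, toy_encode, centroids)
clean_flips = cluster_flip_map(clean, toy_encode, centroids)
```

In this toy setting only the patch covering the trigger flips the assignment of the poisoned image, while no patch flips the assignment of the clean image. A real detector would replace `toy_encode` with the trained SSL model and the fixed centroids with clusters fitted to its embeddings.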

Authors (6)
  1. Shengsheng Qian (13 papers)
  2. Yifei Wang (141 papers)
  3. Dizhan Xue (6 papers)
  4. Shengjie Zhang (9 papers)
  5. Huaiwen Zhang (9 papers)
  6. Changsheng Xu (101 papers)
Citations (1)
