EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection (2405.13080v1)

Published 21 May 2024 in cs.CR and cs.LG

Abstract: Federated self-supervised learning (FSSL) has recently emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data while preserving data privacy. While FSSL offers advantages, its susceptibility to backdoor attacks, a concern identified in traditional federated supervised learning (FSL), has not been investigated. To fill the research gap, we undertake a comprehensive investigation into a backdoor attack paradigm, where unscrupulous clients conspire to manipulate the global model, revealing the vulnerability of FSSL to such attacks. In FSL, backdoor attacks typically build a direct association between the backdoor trigger and the target label. In contrast, in FSSL, backdoor attacks aim to alter the global model's representation for images containing the attacker's specified trigger pattern in favor of the attacker's intended target class, which is less straightforward. In this sense, we demonstrate that existing defenses are insufficient to mitigate the investigated backdoor attacks in FSSL, making an effective defense mechanism urgently needed. To tackle this issue, we dive into the fundamental mechanism of backdoor attacks on FSSL, proposing the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models. In particular, EmInspector assesses the similarity of embeddings from different local models using a small set of inspection images (e.g., ten images of CIFAR100) without specific requirements on sample distribution or labels. We discover that embeddings from backdoored models tend to cluster together in the embedding space for a given inspection image. Evaluation results show that EmInspector can effectively mitigate backdoor attacks on FSSL across various adversary settings. Our code is available at https://github.com/ShuchiWu/EmInspector.
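To make the embedding-inspection idea concrete, the sketch below scores each local model by its average cosine similarity to the other local models' embeddings of a shared inspection set and flags the high-agreement cluster as suspicious. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the function names, the PyTorch framing, and the above-average threshold are illustrative choices.

```python
# Sketch of embedding-inspection-style detection (assumptions: each local model
# maps a batch of images to [N, D] embeddings; thresholding at the mean score
# is an illustrative stand-in for the paper's decision rule).
import torch
import torch.nn.functional as F


def inspect_clients(local_models, inspection_images):
    """Return per-client agreement scores and indices of suspected clients."""
    with torch.no_grad():
        # Embed the same small inspection set with every client's local encoder
        # and L2-normalize so dot products are cosine similarities.
        embs = [F.normalize(m(inspection_images), dim=1) for m in local_models]

    num_clients = len(embs)
    scores = torch.zeros(num_clients)
    for i in range(num_clients):
        for j in range(num_clients):
            if i == j:
                continue
            # Mean cosine similarity between models i and j across inspection images.
            scores[i] += (embs[i] * embs[j]).sum(dim=1).mean()
    scores /= (num_clients - 1)

    # Backdoored models tend to cluster in embedding space, so they exhibit
    # unusually high mutual agreement; flag above-average clients (assumption).
    suspected = (scores > scores.mean()).nonzero(as_tuple=True)[0].tolist()
    return scores, suspected
```

In an aggregation round, the server could run this check on the received local models and exclude (or down-weight) the suspected clients before averaging; how the threshold is set in practice would follow the paper's evaluation rather than the simple mean used here.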

Authors (8)
  1. Yuwen Qian (14 papers)
  2. Shuchi Wu (3 papers)
  3. Kang Wei (41 papers)
  4. Ming Ding (219 papers)
  5. Di Xiao (204 papers)
  6. Tao Xiang (324 papers)
  7. Chuan Ma (35 papers)
  8. Song Guo (138 papers)
