Federated Learning Under Attack: Exposing Vulnerabilities through Data Poisoning Attacks in Computer Networks (2403.02983v1)
Abstract: Federated Learning (FL) is a machine learning (ML) approach that enables multiple decentralized devices or edge servers to collaboratively train a shared model without exchanging raw data. While model updates are trained and exchanged between clients and the server, both the data and the models are susceptible to various data-poisoning attacks. In this study, our motivation is to explore the severity of data-poisoning attacks in the computer-network domain, because they are easy to implement but difficult to detect. We considered two types of data-poisoning attacks, label flipping (LF) and feature poisoning (FP), and applied them with a novel approach. In LF, we randomly flipped the labels of benign samples and trained the model on the manipulated data. In FP, we randomly manipulated the most highly contributing features, identified using the Random Forest algorithm. The datasets used in this experiment were the CIC and UNSW datasets, both from the computer-network domain. We generated adversarial samples using the two attacks described above, applied them to a small percentage of each dataset, and then trained and tested the model's accuracy on the poisoned data. We recorded the results for both the benign and the manipulated datasets and observed significant differences in model accuracy between them. The experimental results show that the LF attack failed, whereas the FP attack was effective, demonstrating its ability to fool the server. With a 1% LF attack on the CIC dataset, the accuracy was approximately 0.0428 and the attack success rate (ASR) was 0.9564, so the attack is easily detectable; with a 1% FP attack, the accuracy and ASR were both approximately 0.9600, so FP attacks are difficult to detect. We repeated the experiment with different poisoning percentages.
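To make the two attacks concrete, below is a minimal sketch of the data-manipulation step, assuming NumPy arrays for features and labels and scikit-learn's RandomForestClassifier for ranking feature importance. The parameter names (`poison_rate`, `top_k`) and the uniform-range perturbation are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' released code) of the LF and FP
# data-poisoning attacks described above. Assumes X is a 2D NumPy float
# array of features and y a 1D NumPy array of integer class labels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def label_flip(y, poison_rate=0.01):
    """LF attack: randomly reassign a fraction of labels to a different class."""
    y_poisoned = y.copy()
    classes = np.unique(y)
    idx = rng.choice(len(y), size=int(len(y) * poison_rate), replace=False)
    for i in idx:
        other = classes[classes != y_poisoned[i]]
        y_poisoned[i] = rng.choice(other)
    return y_poisoned

def feature_poison(X, y, poison_rate=0.01, top_k=5):
    """FP attack: perturb the top-k most important features for a fraction of samples."""
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    top_features = np.argsort(forest.feature_importances_)[::-1][:top_k]
    X_poisoned = X.copy()
    idx = rng.choice(len(X), size=int(len(X) * poison_rate), replace=False)
    for f in top_features:
        # One plausible "random manipulation": redraw the feature uniformly
        # from its observed range for the selected samples.
        lo, hi = X[:, f].min(), X[:, f].max()
        X_poisoned[idx, f] = rng.uniform(lo, hi, size=len(idx))
    return X_poisoned
```

In an experiment of this kind, the poisoned samples would replace a small percentage (e.g., 1%) of a client's local training data before local training, so the manipulated updates are what the client shares with the server.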
Authors: Ehsan Nowroozi, Imran Haider, Rahim Taheri, Mauro Conti