Federated Behavioural Planes: Explaining the Evolution of Client Behaviour in Federated Learning (2405.15632v2)
Abstract: Federated Learning (FL), a privacy-aware approach to distributed deep learning, enables many clients to collaboratively train a model without sharing sensitive data, thereby reducing privacy risks. However, enabling human trust in and control over FL systems requires understanding the evolving behaviour of clients, whether beneficial or detrimental to training, which remains a key challenge in the current literature. To address this challenge, we introduce Federated Behavioural Planes (FBPs), a novel method to analyse, visualise, and explain the dynamics of FL systems, showing how clients behave under two different lenses: predictive performance (error behavioural space) and decision-making processes (counterfactual behavioural space). Our experiments demonstrate that FBPs provide informative trajectories describing the evolving states of clients and their contributions to the global model, thereby enabling the identification of clusters of clients with similar behaviours. Leveraging the patterns identified by FBPs, we propose a robust aggregation technique named Federated Behavioural Shields to detect malicious or noisy client models, enhancing security and surpassing the efficacy of existing state-of-the-art FL defence mechanisms. Our code is publicly available on GitHub.
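The shielding idea described above can be illustrated with a minimal sketch. This is not the paper's actual implementation; all names (`behavioural_shield`, `trajectory_distance`, the `tol` threshold) are hypothetical, and the client "behaviour" is reduced to a per-round error trajectory compared against the coordinate-wise median across clients. Clients whose trajectories stray too far from the median are excluded before averaging their model updates:

```python
from statistics import median

def trajectory_distance(traj, ref):
    """Mean absolute deviation between a client's error trajectory and a reference."""
    return sum(abs(a - b) for a, b in zip(traj, ref)) / len(ref)

def behavioural_shield(updates, trajectories, tol=0.5):
    """Aggregate client model updates, excluding clients whose per-round
    error trajectory deviates from the coordinate-wise median trajectory
    by more than `tol` (a hypothetical tolerance, not from the paper)."""
    rounds = len(trajectories[0])
    # Reference trajectory: median error across clients at each round.
    ref = [median(t[r] for t in trajectories) for r in range(rounds)]
    kept = [u for u, t in zip(updates, trajectories)
            if trajectory_distance(t, ref) <= tol]
    # Plain average of the surviving clients' parameter vectors.
    dim = len(updates[0])
    return [sum(u[i] for u in kept) / len(kept) for i in range(dim)]

# Three honest clients with decreasing loss, one with an anomalous flat
# high-loss trajectory; the latter is filtered out of the aggregate.
updates = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8], [10.0, 10.0]]
trajectories = [[0.9, 0.6, 0.4], [1.0, 0.7, 0.5],
                [0.8, 0.5, 0.3], [3.0, 3.1, 3.2]]
aggregate = behavioural_shield(updates, trajectories)
```

The median reference makes the filter robust: a minority of anomalous clients cannot drag the reference towards themselves, which mirrors the intuition behind median-based Byzantine-robust aggregation rules in the literature.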