Advocating for the Silent: Enhancing Federated Generalization for Non-Participating Clients (2310.07171v7)
Abstract: Federated Learning (FL) has surged in prominence because it enables collaborative model training without direct data sharing. However, the vast disparity in local data distributions among clients, often termed the Non-Independent and Identically Distributed (Non-IID) challenge, poses a significant hurdle to FL's generalization. The scenario becomes even more complex when not all clients participate in training, a common occurrence due to unstable network connections or limited computational capacity, which greatly complicates assessing the trained models' generalization. While many recent studies have examined the generalization gap on unseen data from participating clients with diverse distributions, the discrepancy between the training distributions of participating clients and the testing distributions of non-participating ones has been largely overlooked. In response, our paper develops an information-theoretic generalization framework for FL. Specifically, it quantifies generalization error by evaluating the information entropy of local distributions and the discrepancies across them. Guided by the derived generalization bounds, we introduce a weighted aggregation approach and two client selection strategies, designed to strengthen FL's generalization so that trained models perform better on non-participating clients by incorporating a more diverse range of client data distributions. Extensive empirical evaluations confirm the effectiveness of the proposed methods and align closely with our theoretical analysis.
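The abstract names two mechanisms, entropy-aware weighted aggregation and diversity-driven client selection, without giving their exact form here. As a minimal illustrative sketch only, and not the paper's actual algorithm, the Python below shows one plausible way such ideas could fit together: clients whose label distributions carry more entropy receive larger aggregation weights, and selection greedily favors clients whose distributions differ most from the cohort chosen so far. Every function name and heuristic (`label_entropy`, the L1 diversity gap, the epsilon guard) is our own assumption.

```python
import numpy as np

def label_entropy(label_counts):
    """Shannon entropy (nats) of a client's empirical label distribution."""
    p = np.asarray(label_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # drop empty classes to avoid log(0)
    return float(-(p * np.log(p)).sum())

def entropy_weighted_aggregate(client_params, client_label_counts):
    """Average client parameter vectors, up-weighting clients whose local
    label distributions carry more information (higher entropy).
    NOTE: an illustrative proxy, not the paper's derived bound-based weights."""
    ent = np.array([label_entropy(c) for c in client_label_counts])
    weights = (ent + 1e-12) / (ent + 1e-12).sum()  # guard against all-zero entropy
    return (weights[:, None] * np.stack(client_params)).sum(axis=0)

def select_diverse_clients(client_label_counts, k):
    """Greedy selection: start from the highest-entropy client, then keep
    adding the client whose label distribution is farthest (L1 distance)
    from the running mixture of those already selected."""
    dists = [np.asarray(c, dtype=float) / np.sum(c) for c in client_label_counts]
    selected = [int(np.argmax([label_entropy(d) for d in dists]))]
    while len(selected) < min(k, len(dists)):
        mixture = np.mean([dists[i] for i in selected], axis=0)
        gaps = [-1.0 if i in selected else float(np.abs(d - mixture).sum())
                for i, d in enumerate(dists)]
        selected.append(int(np.argmax(gaps)))
    return selected

# Toy usage: three clients with 4-class label histograms and 2-dim "models".
counts = [[50, 50, 0, 0], [10, 10, 10, 10], [0, 0, 80, 20]]
params = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
print(select_diverse_clients(counts, 2))
print(entropy_weighted_aggregate(params, counts))
```

In the paper, the aggregation weights follow from the derived generalization bounds; the entropy proxy above merely mirrors the abstract's intuition that covering a more diverse range of client distributions should help the global model generalize to non-participating clients.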