Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-synchronous Federated Learning (2403.06900v1)
Abstract: Federated Learning (FL) enables collaborative machine learning among Internet of Things (IoT) devices by training models collectively while preserving data privacy. FL algorithms fall into two primary categories: synchronous and asynchronous. Synchronous FL handles straggler devices efficiently but can compromise convergence speed and model accuracy. Asynchronous FL allows all devices to participate but incurs high communication overhead and potential model staleness. To overcome these limitations, the semi-synchronous FL framework tiers clients based on their computing and communication latencies; clients in different tiers upload their local models at distinct frequencies, striking a balance between straggler mitigation and communication cost. This paper proposes DecantFed (Dynamic client clustering, bandwidth allocation, and local training for semi-synchronous Federated learning), which dynamically optimizes client clustering, bandwidth allocation, and local training workloads to maximize data sample processing rates. DecantFed also adapts client learning rates to their tiers, mitigating model staleness. In extensive simulations on the MNIST and CIFAR-10 benchmark datasets, under both independent and identically distributed (IID) and non-IID data, DecantFed converges faster than FedAvg and FedProx and improves model accuracy by at least 28% over FedProx.
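To make the tiering idea concrete, below is a minimal sketch (not the authors' implementation) of semi-synchronous FL as the abstract describes it: clients are clustered into tiers by latency, tier t uploads its local model every t+1 global rounds, and the local learning rate is scaled per tier to counter staleness. The quantile-based clustering rule, the tier-to-frequency mapping, the staleness discount `GAMMA`, and the toy linear-regression task are all illustrative assumptions; DecantFed derives clustering, bandwidth, and workloads from an optimization problem that this sketch does not reproduce.

```python
# Semi-synchronous FL sketch: tiered upload frequencies + tier-scaled
# learning rates. All constants here are assumed, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

NUM_CLIENTS = 20
DIM = 10            # model dimension (toy linear model)
ROUNDS = 12
BASE_LR = 0.1
GAMMA = 0.5         # assumed staleness discount per tier of extra delay

# Per-client latency = computing + communication latency (toy values).
latency = rng.uniform(0.1, 2.0, NUM_CLIENTS)

# Cluster clients into three tiers by latency quantiles (assumed rule).
edges = np.quantile(latency, [1 / 3, 2 / 3])
tier = np.digitize(latency, edges)        # 0 = fastest tier, 2 = slowest
upload_period = tier + 1                  # tier t uploads every t+1 rounds

# Toy local data: each client holds noisy samples of a shared linear target.
w_true = rng.normal(size=DIM)
X = [rng.normal(size=(50, DIM)) for _ in range(NUM_CLIENTS)]
y = [x @ w_true + 0.1 * rng.normal(size=50) for x in X]

w_global = np.zeros(DIM)
for rnd in range(ROUNDS):
    updates, weights = [], []
    for c in range(NUM_CLIENTS):
        if rnd % upload_period[c] != 0:
            continue                      # this client's tier skips the round
        # Slower tiers train against an older global model, so their step
        # size is discounted (assumed staleness-aware rule).
        lr = BASE_LR * GAMMA ** tier[c]
        w = w_global.copy()
        for _ in range(5):                # local gradient-descent epochs
            grad = X[c].T @ (X[c] @ w - y[c]) / len(y[c])
            w -= lr * grad
        updates.append(w)
        weights.append(len(y[c]))
    if updates:                           # FedAvg-style weighted average
        w_global = np.average(updates, axis=0, weights=weights)
    loss = np.mean([np.mean((x @ w_global - t) ** 2) for x, t in zip(X, y)])
    print(f"round {rnd:2d}: participating={len(updates):2d} loss={loss:.4f}")
```

Because tier-0 clients upload every round while tier-2 clients upload every third round, fast devices are never blocked by stragglers, yet slow devices still contribute; the per-tier learning-rate discount keeps their staler updates from dragging the global model.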
Authors: Liangkun Yu, Xiang Sun, Rana Albelaihi, Chaeeun Park, Sihua Shao