Coordination-free Decentralised Federated Learning on Complex Networks: Overcoming Heterogeneity (2312.04504v1)
Abstract: Federated Learning (FL) is a well-known framework for performing learning tasks in edge computing scenarios, where the participating devices have limited resources and hold only a partial view of the data. The basic assumption of FL is that the devices communicate, directly or indirectly, with a parameter server that centrally coordinates the whole process, thereby sidestepping several challenges of fully distributed training. However, in highly pervasive edge scenarios, a central controller overseeing the process cannot always be guaranteed, and the interactions among devices (i.e., the connectivity graph) may not be predetermined, resulting in a complex network structure. Moreover, the heterogeneity of data and devices further complicates the learning process. This poses new challenges from a learning standpoint, which we address by proposing a communication-efficient Decentralised Federated Learning (DFL) algorithm. Our solution allows devices that communicate only with their direct neighbours to train an accurate model, overcoming the heterogeneity induced by data and by different training histories. Our results show that the resulting local models generalise better than those trained with competing approaches, and do so in a more communication-efficient way.
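The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general neighbour-averaging pattern that decentralised FL schemes build on: each device trains on its own local data and then mixes parameters with its direct neighbours on the connectivity graph, with no parameter server involved. The ring topology, the helper names (`local_step`), and the learning-rate and data choices are illustrative assumptions, not the method proposed in the paper.

```python
import numpy as np

# Hypothetical sketch of neighbour-only decentralised averaging (not the paper's
# specific algorithm): each node fits a linear model on its own data, then
# averages parameters with its direct neighbours on the connectivity graph.

rng = np.random.default_rng(0)
n_nodes, dim = 8, 5
# Illustrative ring topology; the paper targets general complex network structures.
neighbours = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}

# Heterogeneous (non-IID) local datasets: each node sees a shifted slice of the task.
true_w = rng.normal(size=dim)
data = []
for i in range(n_nodes):
    X = rng.normal(loc=i * 0.2, size=(32, dim))
    y = X @ true_w + 0.1 * rng.normal(size=32)
    data.append((X, y))

def local_step(w, X, y, lr=0.02):
    """One local gradient step on the node's own data (squared loss)."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

models = [np.zeros(dim) for _ in range(n_nodes)]
for _ in range(200):
    # 1) Local training, with no central coordination.
    models = [local_step(models[i], *data[i]) for i in range(n_nodes)]
    # 2) Mixing: every node averages its model with its direct neighbours only.
    models = [
        np.mean([models[i]] + [models[j] for j in neighbours[i]], axis=0)
        for i in range(n_nodes)
    ]

print("mean distance to true_w:",
      np.mean([np.linalg.norm(w - true_w) for w in models]))
```

This toy run only illustrates that purely local updates plus neighbour averaging can drive the nodes towards a shared, accurate model despite heterogeneous data; the paper's contribution lies in how the aggregation is made robust to that heterogeneity and communication-efficient.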