Energy-efficient Decentralized Learning via Graph Sparsification
Abstract: This work aims to improve the energy efficiency of decentralized learning by optimizing the mixing matrix, which controls the communication demands during the learning process. Through rigorous analysis based on a state-of-the-art decentralized learning algorithm, the problem is formulated as a bi-level optimization, with the lower level solved by graph sparsification. A solution with guaranteed performance is proposed for the special case of a fully-connected base topology, and a greedy heuristic is proposed for the general case. Simulations based on a real topology and dataset show that the proposed solution can lower the energy consumption at the busiest node by 54%-76% while maintaining the quality of the trained model.