Energy-efficient Decentralized Learning via Graph Sparsification

Published 5 Jan 2024 in cs.LG, cs.DC, and math.OC | (2401.03083v2)

Abstract: This work aims to improve the energy efficiency of decentralized learning by optimizing the mixing matrix, which controls the communication demands during the learning process. Through rigorous analysis based on a state-of-the-art decentralized learning algorithm, the problem is formulated as a bi-level optimization, with the lower level solved by graph sparsification. A solution with guaranteed performance is proposed for the special case of a fully-connected base topology, and a greedy heuristic is proposed for the general case. Simulations based on a real topology and dataset show that the proposed solution can lower the energy consumption at the busiest node by 54%-76% while maintaining the quality of the trained model.
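
To make the role of the mixing matrix concrete, the following is a minimal sketch and not the paper's algorithm. It assumes Metropolis-Hastings weights (a common textbook choice, not stated in the abstract) and compares a fully-connected topology with a hand-picked ring rather than a topology produced by the paper's sparsification. The helper names `metropolis_weights`, `busiest_node_links`, and `gossip_step` are hypothetical, and counting links per node is only a rough proxy for per-node communication energy.

```python
def metropolis_weights(n, edges):
    """Build a symmetric, doubly stochastic mixing matrix for an undirected graph.

    Assumption: Metropolis-Hastings weights; the paper optimizes the matrix instead.
    """
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = 1.0 / (1 + max(deg[i], deg[j]))
        W[i][j] = W[j][i] = w
    for i in range(n):
        # The diagonal is still zero here, so this sums only off-diagonal weights.
        W[i][i] = 1.0 - sum(W[i])
    return W


def busiest_node_links(W):
    """Largest number of neighbors any node exchanges parameters with per round."""
    return max(
        sum(1 for j, w in enumerate(row) if j != i and w != 0.0)
        for i, row in enumerate(W)
    )


def gossip_step(W, x):
    """One averaging round: each node mixes its value with its neighbors' values."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


n = 6
complete_edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
ring_edges = [(i, (i + 1) % n) for i in range(n)]

W_full = metropolis_weights(n, complete_edges)
W_ring = metropolis_weights(n, ring_edges)

print("links at busiest node (complete):", busiest_node_links(W_full))
print("links at busiest node (ring):", busiest_node_links(W_ring))

# The sparser matrix needs more rounds to reach consensus on the average.
x = [float(v) for v in range(n)]
for _ in range(50):
    x = gossip_step(W_ring, x)
print("ring values after 50 rounds:", x)
```

The zero pattern of the matrix determines which links are used in each round, so a sparser matrix reduces the load at the busiest node at the cost of slower mixing. That trade-off is what the optimization described in the abstract is meant to balance.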

Authors (3)