
Decentralized Optimization in Networks with Arbitrary Delays (2401.11344v1)

Published 20 Jan 2024 in math.OC, cs.MA, cs.SY, eess.SP, and eess.SY

Abstract: We consider the problem of decentralized optimization in networks with communication delays. To accommodate delays, we need decentralized optimization algorithms that work on directed graphs. Existing approaches require nodes to know their out-degree to achieve convergence. We propose a novel gossip-based algorithm that circumvents this requirement, allowing decentralized optimization in networks with communication delays. We prove that our algorithm converges on non-convex objectives, with the same main complexity order term as centralized Stochastic Gradient Descent (SGD), and show that the graph topology and the delays only affect the higher order terms. We provide numerical simulations that illustrate our theoretical results.
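The abstract describes gossip-style decentralized SGD over a directed graph under communication delays. As a rough illustration of that setting (not the paper's algorithm, whose update rule is not given here), the sketch below simulates decentralized gradient descent on a directed ring where every message arrives after a fixed delay; the node count, delay length, step-size schedule, and quadratic local objectives are all illustrative assumptions.

```python
# Toy illustration (NOT the paper's algorithm): decentralized SGD on a
# directed 5-node ring where every message is delivered after a fixed
# delay. Each node i minimizes f_i(x) = (x - t_i)^2, so the network-wide
# optimum is the mean of the targets t_i. All constants are assumptions.

N = 5                                  # number of nodes (assumption)
DELAY = 2                              # delivery delay in rounds (assumption)
targets = [1.0, 2.0, 3.0, 4.0, 5.0]    # minima of the local objectives
x = [0.0] * N                          # local iterates
inbox = []                             # in-flight messages: (deliver_round, dst, value)

for t in range(3000):
    step = 2.0 / (t + 20)              # diminishing step size (assumption)
    # Deliver messages whose delay has elapsed; mix them with the local iterate.
    for _, dst, val in [m for m in inbox if m[0] == t]:
        x[dst] = 0.5 * (x[dst] + val)
    inbox = [m for m in inbox if m[0] > t]
    # Local gradient step on f_i (the exact gradient stands in for a stochastic one).
    for i in range(N):
        x[i] -= step * 2.0 * (x[i] - targets[i])
    # Each node gossips its current value to its out-neighbor on the directed ring.
    for i in range(N):
        inbox.append((t + DELAY, (i + 1) % N, x[i]))

print([round(v, 2) for v in x])  # all nodes settle near the common optimum 3.0
```

With a diminishing step size, the gradient perturbations shrink over time, so the delayed gossip averaging drives the nodes to consensus while the averaged gradients pull that consensus toward the global minimizer, mirroring the qualitative claim of the abstract that delays affect only lower-order terms.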

Authors (2)
  1. Tomas Ortega (32 papers)
  2. Hamid Jafarkhani (76 papers)
Citations (1)
