Distributed Riemannian Stochastic Gradient Tracking Algorithm on the Stiefel Manifold (2405.16900v2)

Published 27 May 2024 in math.OC

Abstract: This paper investigates the distributed Riemannian stochastic optimization problem on the Stiefel manifold for multi-agent systems, where all agents work collaboratively to optimize a function modeled as the average of their expectation-valued local costs. Each agent processes only its own local cost function and communicates with neighboring agents to reach an optimal solution while ensuring consensus. Since the local Riemannian gradient cannot be computed directly in the stochastic regime, we estimate it by averaging a variable number of sampled gradients, which introduces noise into the system. We then propose a distributed Riemannian stochastic optimization algorithm on the Stiefel manifold that combines the variable-sample-size gradient approximation method with the gradient tracking dynamic. Notably, a suitably chosen increasing sample size plays an important role in improving the algorithm's efficiency, as it reduces the noise variance. In an expectation-valued sense, the iterates of all agents are proved to converge to a stationary point (or a neighborhood of one) with fixed step sizes. We further establish the convergence rate of the iterates when the sample size is exponentially increasing, polynomially increasing, or constant. Finally, numerical experiments are implemented to demonstrate the theoretical results.
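
The abstract does not give the update formulas, but the ingredients it names — a Riemannian gradient estimated by averaging a growing number of sampled gradients, a gradient-tracking recursion, consensus over a communication graph, and a retraction back onto the Stiefel manifold — can be illustrated with a short NumPy sketch. Everything below is an assumption made for illustration (the polar retraction, the doubly stochastic mixing matrix W, the step sizes alpha and beta, and the doubling batch-size schedule); it is not the authors' exact algorithm or constants.

import numpy as np

def sym(A):
    # Symmetric part of a square matrix.
    return 0.5 * (A + A.T)

def proj_tangent(X, G):
    # Project an ambient-space matrix G onto the tangent space of the
    # Stiefel manifold St(n, p) at X.
    return G - X @ sym(X.T @ G)

def retract(X, V):
    # Polar retraction: map X + V (a tangent step) back onto St(n, p).
    U, _, Vt = np.linalg.svd(X + V, full_matrices=False)
    return U @ Vt

def stochastic_rgrad(X, grad_sample, batch_size, rng):
    # Variable-sample-size gradient estimate: average `batch_size` noisy
    # Euclidean gradients, then project onto the tangent space at X.
    G = np.mean([grad_sample(X, rng) for _ in range(batch_size)], axis=0)
    return proj_tangent(X, G)

def distributed_rsgt(X0_list, grad_samples, W, alpha=0.5, beta=0.05,
                     n_iters=200, sample_size=lambda k: min(2 ** k, 1024),
                     seed=0):
    # Illustrative distributed Riemannian stochastic gradient tracking loop.
    #   X0_list      : initial points on St(n, p), one per agent
    #   grad_samples : list of callables (X, rng) -> noisy Euclidean gradient
    #   W            : doubly stochastic mixing matrix (NumPy array) of the network
    #   sample_size  : increasing batch-size schedule (reduces gradient noise)
    rng = np.random.default_rng(seed)
    m = len(X0_list)
    X = [X0.copy() for X0 in X0_list]
    g = [stochastic_rgrad(X[i], grad_samples[i], sample_size(0), rng)
         for i in range(m)]
    Y = [gi.copy() for gi in g]  # gradient trackers, one per agent
    for k in range(n_iters):
        X_new = []
        for i in range(m):
            # Consensus step plus a move along the tracked gradient,
            # followed by a retraction back onto the manifold.
            consensus = sum(W[i, j] * X[j] for j in range(m)) - X[i]
            step = (alpha * proj_tangent(X[i], consensus)
                    - beta * proj_tangent(X[i], Y[i]))
            X_new.append(retract(X[i], step))
        # Fresh gradient estimates with the (typically larger) sample size.
        g_new = [stochastic_rgrad(X_new[i], grad_samples[i],
                                  sample_size(k + 1), rng) for i in range(m)]
        # Gradient-tracking update: mix trackers, add the gradient increment.
        Y = [sum(W[i, j] * Y[j] for j in range(m)) + g_new[i] - g[i]
             for i in range(m)]
        X, g = X_new, g_new
    return X

As a usage sketch, for a decentralized PCA-type problem each grad_samples[i] could return a noisy version of -A_i @ X for a local covariance matrix A_i. The exponentially increasing, polynomially increasing, and constant schedules passed as sample_size correspond to the three cases for which the paper establishes convergence rates.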
