Decentralized Distributed Optimization for Saddle Point Problems (2102.07758v7)

Published 15 Feb 2021 in math.OC and cs.DC

Abstract: We consider distributed convex-concave saddle point problems over arbitrary connected undirected networks and propose a decentralized distributed algorithm for their solution. The local functions distributed across the nodes are assumed to have global and local groups of variables. For the proposed algorithm, we prove non-asymptotic convergence rate estimates with explicit dependence on the network characteristics. To complement the convergence rate analysis, we establish lower bounds for strongly-convex-strongly-concave and convex-concave saddle point problems over arbitrary connected undirected networks. We illustrate the problem setting with a particular application: the distributed computation of non-regularized Wasserstein barycenters.
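
The general setting behind the abstract is a min-max problem shared across a network: each of n nodes holds a local convex-concave function f_i, and together they solve min_x max_y (1/n) sum_i f_i(x, y) while each node communicates only with its neighbors. As a rough illustration of that setting (not the algorithm proposed in the paper, and ignoring the paper's split into global and local variable groups), the sketch below runs a decentralized extragradient method with gossip averaging on a toy strongly-convex-strongly-concave problem over a ring network. The matrices A_i, the step size, and the Metropolis mixing matrix W are all illustrative assumptions.

```python
# A minimal sketch, NOT the paper's algorithm: decentralized extragradient
# with gossip averaging for the toy saddle point problem
#     min_x max_y (1/n) * sum_i [ 0.5*||x||^2 + x^T A_i y - 0.5*||y||^2 ]
# over a ring of n nodes. All problem data below are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3                            # number of nodes, variable dimension
A = 0.5 * rng.standard_normal((n, d, d))

# Symmetric doubly stochastic mixing matrix for a ring (Metropolis weights):
# each node averages with itself and its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    W[i, i] = 1.0 / 3.0

x = rng.standard_normal((n, d))        # row i: node i's copy of the min variable
y = rng.standard_normal((n, d))        # row i: node i's copy of the max variable
eta = 0.1                              # step size (illustrative)

def grads(x, y):
    """Per-node gradients of f_i(x_i, y_i) = 0.5|x|^2 + x^T A_i y - 0.5|y|^2."""
    gx = x + np.einsum('ijk,ik->ij', A, y)   # d/dx = x + A_i y
    gy = np.einsum('ijk,ij->ik', A, x) - y   # d/dy = A_i^T x - y
    return gx, gy

for _ in range(500):
    # Extragradient: a predictor half-step, then a corrector step, each
    # followed by one round of neighbor averaging (gossip) via W.
    gx, gy = grads(x, y)
    x_half, y_half = W @ (x - eta * gx), W @ (y + eta * gy)
    gx, gy = grads(x_half, y_half)
    x, y = W @ (x - eta * gx), W @ (y + eta * gy)

# The unique saddle point of the toy problem is x = y = 0, and gossip should
# also drive the nodes to consensus; both errors should be near zero.
print("consensus error:", np.abs(x - x.mean(axis=0)).max())
print("distance to saddle point:", max(np.abs(x).max(), np.abs(y).max()))
```

The toy is deliberately chosen so that every local f_i has its saddle point at the origin, which lets plain gossip averaging reach the exact solution without any correction terms; the paper's contribution is the general case, with convergence rates and lower bounds that depend explicitly on the network's connectivity.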
