
Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions (2405.18577v4)

Published 28 May 2024 in math.OC, cs.LG, and stat.ML

Abstract: In this paper, we study a class of non-smooth non-convex problems in the form of $\min_{x}[\max_{y\in Y}\phi(x, y) - \max_{z\in Z}\psi(x, z)]$, where both $\Phi(x) = \max_{y\in Y}\phi(x, y)$ and $\Psi(x)=\max_{z\in Z}\psi(x, z)$ are weakly convex functions, and $\phi(x, y), \psi(x, z)$ are strongly concave functions in terms of $y$ and $z$, respectively. It covers two families of problems that have been studied but are missing single-loop stochastic algorithms, i.e., difference of weakly convex functions and weakly convex strongly-concave min-max problems. We propose a stochastic Moreau envelope approximate gradient method dubbed SMAG, the first single-loop algorithm for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate. The key idea of the design is to compute an approximate gradient of the Moreau envelopes of $\Phi, \Psi$ using only one step of stochastic gradient update of the primal and dual variables. Empirically, we conduct experiments on positive-unlabeled (PU) learning and partial area under ROC curve (pAUC) optimization with an adversarial fairness regularizer to validate the effectiveness of our proposed algorithms.
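To make the abstract's key idea concrete, here is a brief recap of the standard Moreau-envelope facts it relies on (Moreau, 1965); this is background, not the paper's exact algorithm. For a $\rho$-weakly convex function $F$ and smoothing parameter $0 < \lambda < 1/\rho$, the Moreau envelope and proximal mapping are

$F_\lambda(x) = \min_{u}[F(u) + \frac{1}{2\lambda}\|u - x\|^2]$, $\quad \mathrm{prox}_{\lambda F}(x) = \arg\min_{u}[F(u) + \frac{1}{2\lambda}\|u - x\|^2]$,

and the envelope is smooth with gradient $\nabla F_\lambda(x) = \frac{1}{\lambda}(x - \mathrm{prox}_{\lambda F}(x))$. Applying this with $F \in \{\Phi, \Psi\}$ gives $\nabla \Phi_\lambda(x) - \nabla \Psi_\lambda(x)$ as a surrogate gradient for the difference $\Phi - \Psi$. Computing either proximal point exactly would require an inner optimization loop; the point of SMAG, as the abstract describes it, is to approximate both proximal points with a single stochastic gradient step on the primal variable and single stochastic ascent steps on the dual variables $y$ and $z$ per iteration, which is what makes the algorithm single-loop.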

