Linearization Algorithms for Fully Composite Optimization (2302.12808v2)

Published 24 Feb 2023 in math.OC and cs.LG

Abstract: This paper studies first-order algorithms for solving fully composite optimization problems over convex and compact sets. We leverage the structure of the objective by handling its differentiable and non-differentiable components separately, linearizing only the smooth parts. This provides us with new generalizations of the classical Frank-Wolfe method and the Conditional Gradient Sliding algorithm that cater to a subclass of non-differentiable problems. Our algorithms rely on a stronger version of the linear minimization oracle, which can be efficiently implemented in several practical applications. We provide the basic version of our method with an affine-invariant analysis and prove global convergence rates for both convex and non-convex objectives. Furthermore, in the convex case, we propose an accelerated method with correspondingly improved complexity. Finally, we provide illustrative experiments to support our theoretical results.
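The abstract describes a Frank-Wolfe-style scheme in which only the smooth component of the objective is linearized, while the non-differentiable part is handled exactly inside a stronger linear minimization oracle. The sketch below illustrates that general idea in Python; it is not the paper's algorithm as stated. The function names (`grad_f`, `composite_lmo`), the subproblem form, and the open-loop step size are assumptions made for illustration only.

```python
import numpy as np

def composite_frank_wolfe(x0, grad_f, composite_lmo, num_iters=100):
    """Hedged sketch of a Frank-Wolfe-style linearization method.

    composite_lmo(g, x) is assumed to return a minimizer over the compact
    convex feasible set X of the partially linearized model: the smooth part
    is replaced by its linearization at x, while the non-differentiable part
    is kept exact. This plays the role of the "stronger" linear minimization
    oracle mentioned in the abstract.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        g = grad_f(x)                        # gradient of the smooth component only
        y = composite_lmo(g, x)              # oracle solves the partially linearized subproblem over X
        gamma = 2.0 / (k + 2.0)              # classical open-loop Frank-Wolfe step size (an assumption here)
        x = x + gamma * (np.asarray(y) - x)  # convex combination keeps the iterate feasible in X
    return x

# Example: with no non-differentiable part the oracle reduces to a standard
# linear minimization oracle, recovering plain Frank-Wolfe. Here we minimize
# f(x) = 0.5 * ||Ax - b||^2 over the probability simplex.
A = np.random.randn(20, 5)
b = np.random.randn(20)
grad_f = lambda x: A.T @ (A @ x - b)
lmo = lambda g, x: np.eye(5)[np.argmin(g)]   # simplex vertex minimizing <g, y>
x_approx = composite_frank_wolfe(np.ones(5) / 5, grad_f, lmo, num_iters=200)
```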

