Perseus: A Simple and Optimal High-Order Method for Variational Inequalities (2205.03202v7)
Abstract: This paper settles an open and challenging question pertaining to the design of simple and optimal high-order methods for solving smooth and monotone variational inequalities (VIs). A VI involves finding $x^\star \in \mathcal{X}$ such that $\langle F(x), x - x^\star\rangle \geq 0$ for all $x \in \mathcal{X}$. We consider the setting in which $F$ is smooth with up to $(p-1)^{th}$-order derivatives. For $p = 2$, the cubic regularized Newton method was extended to VIs with a global rate of $O(\epsilon^{-1})$. An improved rate of $O(\epsilon^{-2/3}\log\log(1/\epsilon))$ can be obtained via an alternative second-order method, but this method requires a nontrivial line-search procedure as an inner loop. Similarly, high-order methods based on line-search procedures have been shown to achieve a rate of $O(\epsilon^{-2/(p+1)}\log\log(1/\epsilon))$. As emphasized by Nesterov, however, such procedures do not necessarily imply practical applicability in large-scale applications, and it would be desirable to complement these results with a simple high-order VI method that retains the optimality of the more complex methods. We propose a $p^{th}$-order method that does \textit{not} require any line-search procedure and provably converges to a weak solution at a rate of $O(\epsilon^{-2/(p+1)})$. We prove that our $p^{th}$-order method is optimal in the monotone setting by establishing a matching lower bound under a generalized linear span assumption. With restarting, our method attains a linear rate for smooth and uniformly monotone VIs and a local superlinear rate for smooth and strongly monotone VIs. Our method also achieves a global rate of $O(\epsilon^{-2/p})$ for solving smooth and nonmonotone VIs satisfying the Minty condition; when augmented with restarting, it attains a global linear and local superlinear rate for smooth and nonmonotone VIs satisfying the uniform/strong Minty condition.
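As a concrete reference point for these rates, note that at $p = 1$ the bound $O(\epsilon^{-2/(p+1)})$ reduces to the familiar $O(\epsilon^{-1})$ achieved by Korpelevich's extragradient method for Lipschitz monotone operators. The sketch below is \textit{not} the paper's $p^{th}$-order method; it is a minimal NumPy implementation of that first-order baseline, with an illustrative skew-symmetric operator and ball constraint chosen purely for demonstration (both are assumptions, not taken from the paper). The ergodic average of the extrapolation points plays the role of the approximate weak solution.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the feasible set {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def extragradient(F, x0, step, n_iters, project=project_ball):
    """Korpelevich's extragradient: extrapolate with F(x_k), update with F(x_{k+1/2})."""
    x = x0.copy()
    avg = np.zeros_like(x0)
    for _ in range(n_iters):
        x_half = project(x - step * F(x))   # extrapolation step
        x = project(x - step * F(x_half))   # correction step at the extrapolated point
        avg += x_half
    # The ergodic average of the x_{k+1/2} iterates approximates a weak solution.
    return x, avg / n_iters

# Illustrative monotone operator (an assumption for this demo): F(x) = A x with A
# skew-symmetric, the VI reformulation of the bilinear saddle problem min_u max_v u^T B v.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
A = np.block([[np.zeros((3, 3)), B], [-B.T, np.zeros((3, 3))]])
F = lambda x: A @ x
L = np.linalg.norm(A, 2)  # Lipschitz constant of F = spectral norm of A
_, x_avg = extragradient(F, rng.standard_normal(6), step=0.5 / L, n_iters=2000)
print("||F(x_avg)|| at the ergodic average:", np.linalg.norm(F(x_avg)))
```

The step size $0.5/L$ respects the standard requirement that extragradient use a step below $1/L$ for an $L$-Lipschitz monotone operator; for this example a solution is $x^\star = 0$ (generically the unique one, since $A$ is invertible), so the printed residual should be small and shrink as the iteration budget grows.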
- D. Adil, B. Bullins, A. Jambulapati, and S. Sachdeva. Optimal methods for higher-order smooth monotone variational inequalities. ArXiv Preprint: 2205.06167, 2022.
- A. S. Antipin. Method of convex programming using a symmetric modification of Lagrange function. Matekon, 14(2):23–38, 1978.
- Y. Arjevani, O. Shamir, and R. Shiff. Oracle complexity of second-order methods for smooth convex optimization. Mathematical Programming, 178(1):327–360, 2019.
- M. Baes. Estimate sequence methods: extensions and approximations. Institute for Operations Research, ETH, Zürich, Switzerland, 2009.
- H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2017.
- E. G. Birgin, J. L. Gardenghi, J. M. Martínez, S. A. Santos, and Ph. L. Toint. Evaluation complexity for nonlinear constrained optimization using unscaled KKT conditions and high-order models. SIAM Journal on Optimization, 26(2):951–967, 2016.
- E. G. Birgin, J. L. Gardenghi, J. M. Martínez, S. A. Santos, and Ph. L. Toint. Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models. Mathematical Programming, 163(1-2):359–368, 2017.
- L. Brighi and R. John. Characterizations of pseudomonotone maps and economic equilibrium. Journal of Statistics and Management Systems, 5(1-3):253–273, 2002.
- B. Bullins. Highly smooth minimization of nonsmooth problems. In COLT, pages 988–1030. PMLR, 2020.
- B. Bullins and K. A. Lai. Higher-order methods for convex-concave min-max optimization and monotone variational inequalities. SIAM Journal on Optimization, 32(3):2208–2229, 2022.
- Y. Carmon and J. Duchi. Gradient descent finds the cubic-regularized nonconvex Newton step. SIAM Journal on Optimization, 29(3):2146–2178, 2019.
- Y. Carmon, J. C. Duchi, O. Hinder, and A. Sidford. Lower bounds for finding stationary points I. Mathematical Programming, 184(1-2):71–120, 2020.
- Y. Carmon, D. Hausler, A. Jambulapati, Y. Jin, and A. Sidford. Optimal and adaptive Monteiro-Svaiter acceleration. In NeurIPS, pages 20338–20350, 2022.
- C. Cartis, N. I. M. Gould, and Ph. L. Toint. On the complexity of steepest descent, Newton’s and regularized Newton’s methods for nonconvex unconstrained optimization problems. SIAM Journal on Optimization, 20(6):2833–2852, 2010.
- C. Cartis, N. I. M. Gould, and Ph. L. Toint. Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Mathematical Programming, 127(2):245–295, 2011a.
- C. Cartis, N. I. M. Gould, and Ph. L. Toint. Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity. Mathematical Programming, 130(2):295–319, 2011b.
- C. Cartis, N. I. M. Gould, and Ph. L. Toint. Universal regularization methods: Varying the power, the smoothness and the accuracy. SIAM Journal on Optimization, 29(1):595–615, 2019.
- C. Cartis, N. I. M. Gould, and Ph. L. Toint. Evaluation Complexity of Algorithms for Nonconvex Optimization: Theory, Computation and Perspectives. SIAM, 2022.
- N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
- Y. Chen, G. Lan, and Y. Ouyang. Accelerated schemes for a class of variational inequalities. Mathematical Programming, 165(1):113–149, 2017.
- S. C. Choi, W. S. DeSarbo, and P. T. Harker. Product positioning under price competition. Management Science, 36(2):175–199, 1990.
- R. W. Cottle, F. Giannessi, and J.-L. Lions. Variational Inequalities and Complementarity Problems: Theory and Applications. John Wiley & Sons, 1980.
- C. D. Dang and G. Lan. On the convergence properties of non-Euclidean extragradient methods for variational inequalities with generalized monotone operators. Computational Optimization and Applications, 60(2):277–310, 2015.
- C. Daskalakis, S. Skoulakis, and M. Zampetakis. The complexity of constrained min-max optimization. In STOC, pages 1466–1478, 2021.
- J. Diakonikolas. Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities. In COLT, pages 1428–1451. PMLR, 2020.
- J. Diakonikolas, C. Daskalakis, and M. I. Jordan. Efficient methods for structured nonconvex-nonconcave min-max optimization. In AISTATS, pages 2746–2754. PMLR, 2021.
- N. Doikov and Y. Nesterov. Local convergence of tensor methods. Mathematical Programming, 193(1):315–336, 2022.
- C. Ewerhart. Cournot games with biconcave demand. Games and Economic Behavior, 85:37–47, 2014.
- F. Facchinei and J.-S. Pang. Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer Science & Business Media, 2007.
- O. Fercoq and Z. Qu. Adaptive restart of accelerated gradient methods under local quadratic growth condition. IMA Journal of Numerical Analysis, 39(4):2069–2095, 2019.
- R. M. Freund and H. Lu. New computational guarantees for solving convex optimization problems with first order methods, via a function growth condition measure. Mathematical Programming, 170(2):445–477, 2018.
- M. Fukushima. Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Mathematical Programming, 53:99–110, 1992.
- G. Gallego and M. Hu. Dynamic pricing of perishable assets under competition. Management Science, 60(5):1241–1259, 2014.
- A. Gasnikov, P. Dvurechensky, E. Gorbunov, E. Vorontsova, D. Selikhanovych, C. A. Uribe, B. Jiang, H. Wang, S. Zhang, S. Bubeck, Q. Jiang, Y. T. Lee, Y. Li, and A. Sidford. Near optimal methods for minimizing convex functions with Lipschitz p-th derivatives. In COLT, pages 1392–1393. PMLR, 2019.
- S. Ghadimi and G. Lan. Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, II: shrinking procedures and optimal algorithms. SIAM Journal on Optimization, 23(4):2061–2089, 2013.
- P. Giselsson and S. Boyd. Monotonicity and restart in fast gradient methods. In CDC, pages 5058–5063. IEEE, 2014.
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, pages 2672–2680, 2014.
- N. I. M. Gould, S. Lucidi, M. Roma, and Ph. L. Toint. Solving the trust-region subproblem using the Lanczos method. SIAM Journal on Optimization, 9(2):504–525, 1999.
- N. I. M. Gould, D. P. Robinson, and H. S. Thorne. On solving trust-region and other regularised subproblems in optimization. Mathematical Programming Computation, 2(1):21–57, 2010.
- G. Grapiglia and Y. Nesterov. Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM Journal on Optimization, 27(1):478–506, 2017.
- G. Grapiglia and Y. Nesterov. Accelerated regularized Newton methods for minimizing composite convex functions. SIAM Journal on Optimization, 29(1):77–99, 2019.
- G. Grapiglia and Y. Nesterov. Tensor methods for minimizing convex functions with Hölder continuous higher-order derivatives. SIAM Journal on Optimization, 30(4):2750–2779, 2020.
- G. Grapiglia and Y. Nesterov. On inexact solution of auxiliary problems in tensor methods for convex optimization. Optimization Methods and Software, 36(1):145–170, 2021.
- G. Grapiglia and Y. Nesterov. Adaptive third-order methods for composite convex optimization. SIAM Journal on Optimization, 33(3):1855–1883, 2023.
- J. H. Hammond and T. L. Magnanti. Generalized descent methods for asymmetric systems of equations. Mathematics of Operations Research, 12(4):678–699, 1987.
- P. T. Harker and J.-S. Pang. Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications. Mathematical Programming, 48(1):161–220, 1990.
- P. Hartman and G. Stampacchia. On some non-linear elliptic differential-functional equations. Acta Mathematica, 115:271–310, 1966.
- K. Huang and S. Zhang. An approximation-based regularized extra-gradient method for monotone variational inequalities. ArXiv Preprint: 2210.04440, 2022.
- K. Huang and S. Zhang. Beyond monotone variational inequalities: Solution methods and iteration complexities. ArXiv Preprint: 2304.04153, 2023.
- K. Huang, J. Zhang, and S. Zhang. Cubic regularized Newton method for the saddle point models: A global and local convergence analysis. Journal of Scientific Computing, 91(2):1–31, 2022.
- A. N. Iusem, A. Jofré, R. I. Oliveira, and P. Thompson. Extragradient method with variance reduction for stochastic variational inequalities. SIAM Journal on Optimization, 27(2):686–724, 2017.
- B. Jiang, T. Lin, and S. Zhang. A unified adaptive tensor approximation scheme to accelerate composite convex optimization. SIAM Journal on Optimization, 30(4):2897–2926, 2020.
- R. Jiang and A. Mokhtari. Generalized optimistic methods for convex-concave saddle point problems. ArXiv Preprint: 2202.09674, 2022.
- A. Kannan and U. V. Shanbhag. Optimal stochastic extragradient schemes for pseudomonotone stochastic variational inequality problems and their variants. Computational Optimization and Applications, 74(3):779–820, 2019.
- D. Kinderlehrer and G. Stampacchia. An Introduction to Variational Inequalities and Their Applications. SIAM, 2000.
- R. Kleinberg, Y. Li, and Y. Yuan. An alternative view: When does SGD escape local minima? In ICML, pages 2698–2707. PMLR, 2018.
- G. Kornowski and O. Shamir. High-order oracle complexity of smooth and strongly convex optimization. ArXiv Preprint: 2010.06642, 2020.
- G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Matecon, 12:747–756, 1976.
- G. Kotsalis, G. Lan, and T. Li. Simple and optimal methods for stochastic variational inequalities, I: Operator extrapolation. SIAM Journal on Optimization, 32(3):2041–2073, 2022.
- D. Kovalev and A. Gasnikov. The first optimal acceleration of high-order methods in smooth convex optimization. In NeurIPS, pages 35339–35351, 2022.
- G. Lan and Y. Zhou. An optimal randomized incremental gradient method. Mathematical Programming, 171(1):167–215, 2018a.
- G. Lan and Y. Zhou. Random gradient extrapolation for distributed and stochastic optimization. SIAM Journal on Optimization, 28(4):2753–2782, 2018b.
- C. E. Lemke and J. T. Howson, Jr. Equilibrium points of bimatrix games. Journal of the Society for Industrial and Applied Mathematics, 12(2):413–423, 1964.
- Y. Li and Y. Yuan. Convergence analysis of two-layer neural networks with ReLU activation. In NIPS, pages 597–607, 2017.
- T. Lin and M. I. Jordan. A control-theoretic perspective on optimal high-order optimization. Mathematical Programming, 195(1):929–975, 2022.
- T. Lin and M. I. Jordan. Monotone inclusions, acceleration, and closed-loop control. Mathematics of Operations Research, 48(4):2353–2382, 2023.
- T. Lin, P. Mertikopoulos, and M. I. Jordan. Explicit second-order min-max optimization methods with optimal convergence guarantee. ArXiv Preprint: 2210.12860, 2022.
- M. Liu, H. Rafique, Q. Lin, and T. Yang. First-order convergence theory for weakly-convex-weakly-concave min-max problems. Journal of Machine Learning Research, 22(169):1–34, 2021.
- A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018. URL https://openreview.net/forum?id=rJzIBfZAb.
- T. L. Magnanti and G. Perakis. A unifying geometric solution framework and complexity analysis for variational inequalities. Mathematical Programming, 71(3):327–351, 1995.
- T. L. Magnanti and G. Perakis. Averaging schemes for variational inequalities and systems of equations. Mathematics of Operations Research, 22(3):568–587, 1997a.
- T. L. Magnanti and G. Perakis. The orthogonality theorem and the strong-f-monotonicity condition for variational inequality algorithms. SIAM Journal on Optimization, 7(1):248–273, 1997b.
- T. L. Magnanti and G. Perakis. Solving variational inequality and fixed point problems by line searches and potential optimization. Mathematical Programming, 101(3):435–461, 2004.
- M. Marques Alves. Variants of the A-HPE and large-step A-HPE algorithms for strongly convex problems with applications to accelerated high-order tensor methods. Optimization Methods and Software, 37(6):2021–2051, 2022.
- J. Martínez. On high-order model regularization for constrained optimization. SIAM Journal on Optimization, 27(4):2447–2458, 2017.
- P. Mertikopoulos and Z. Zhou. Learning in games with continuous action sets and unknown payoff functions. Mathematical Programming, 173(1):465–507, 2019.
- G. J. Minty. Monotone (nonlinear) operators in Hilbert space. Duke Mathematical Journal, 29(3):341–346, 1962.
- A. Mokhtari, A. Ozdaglar, and S. Pattathil. Convergence rate of O(1/k) for optimistic gradient and extragradient methods in smooth convex-concave saddle point problems. SIAM Journal on Optimization, 30(4):3230–3251, 2020.
- R. D. C. Monteiro and B. F. Svaiter. On the complexity of the hybrid proximal extragradient method for the iterates and the ergodic mean. SIAM Journal on Optimization, 20(6):2755–2787, 2010.
- R. D. C. Monteiro and B. F. Svaiter. Complexity of variants of Tseng’s modified FB splitting and Korpelevich’s methods for hemivariational inequalities with applications to saddle-point and convex optimization problems. SIAM Journal on Optimization, 21(4):1688–1720, 2011.
- R. D. C. Monteiro and B. F. Svaiter. Iteration-complexity of a Newton proximal extragradient method for monotone variational inequalities and inclusion problems. SIAM Journal on Optimization, 22(3):914–935, 2012.
- R. D. C. Monteiro and B. F. Svaiter. An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods. SIAM Journal on Optimization, 23(2):1092–1125, 2013.
- I. Necoara, Y. Nesterov, and F. Glineur. Linear convergence of first order methods for non-strongly convex optimization. Mathematical Programming, 175(1):69–107, 2019.
- A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
- A. Nemirovski and Y. Nesterov. Optimal methods of smooth convex minimization. USSR Computational Mathematics and Mathematical Physics, 25(2):21–30, 1985.
- Y. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k^2). In Doklady Akademii Nauk, volume 269, pages 543–547. Russian Academy of Sciences, 1983.
- Y. Nesterov. Cubic regularization of Newton’s method for convex problems with constraints. Technical report, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE), 2006.
- Y. Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming, 109(2):319–344, 2007.
- Y. Nesterov. Accelerating the cubic regularization of Newton’s method on convex problems. Mathematical Programming, 112(1):159–181, 2008.
- Y. Nesterov. Gradient methods for minimizing composite functions. Mathematical Programming, 140(1):125–161, 2013.
- Y. Nesterov. Lectures on Convex Optimization, volume 137. Springer, 2018.
- Y. Nesterov. Inexact high-order proximal-point methods with auxiliary search procedure. SIAM Journal on Optimization, 31(4):2807–2828, 2021a.
- Y. Nesterov. Implementable tensor methods in unconstrained convex optimization. Mathematical Programming, 186(1):157–183, 2021b.
- Y. Nesterov. Inexact accelerated high-order proximal-point methods. Mathematical Programming, pages 1–26, 2021c.
- Y. Nesterov. Superfast second-order methods for unconstrained convex optimization. Journal of Optimization Theory and Applications, 191(1):1–30, 2021d.
- Y. Nesterov and B. Polyak. Cubic regularization of Newton method and its global performance. Mathematical Programming, 108(1):177–205, 2006.
- P. Ostroukhov, R. Kamalov, P. Dvurechensky, and A. Gasnikov. Tensor methods for strongly convex strongly concave saddle point problems and strongly monotone variational inequalities. ArXiv Preprint: 2012.15595, 2020.
- Y. Ouyang and Y. Xu. Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Mathematical Programming, 185(1):1–35, 2021.
- B. O’Donoghue and E. Candès. Adaptive restart for accelerated gradient schemes. Foundations of Computational Mathematics, 15(3):715–732, 2015.
- L. D. Popov. A modification of the Arrow-Hurwicz method for search of saddle points. Mathematical notes of the Academy of Sciences of the USSR, 28(5):845–848, 1980.
- D. Ralph and S. J. Wright. Superlinear convergence of an interior-point method for monotone variational inequalities. Complementarity and Variational Problems: State of the Art, pages 345–385, 1997.
- J. Renegar and B. Grimmer. A simple nearly optimal restart scheme for speeding up first-order methods. Foundations of Computational Mathematics, 22(1):211–256, 2022.
- R. T. Rockafellar and R. J.-B. Wets. Variational Analysis, volume 317. Springer Science & Business Media, 2009.
- V. Roulet and A. d’Aspremont. Sharpness, restart and acceleration. In NIPS, pages 1119–1129, 2017.
- H. Scarf. The approximation of fixed points of a continuous mapping. SIAM Journal on Applied Mathematics, 15(5):1328–1343, 1967.
- A. Sinha, H. Namkoong, and J. Duchi. Certifiable distributional robustness with principled adversarial training. In ICLR, 2018. URL https://openreview.net/forum?id=Hk6kPgZA-.
- M. V. Solodov and B. F. Svaiter. A new projection method for variational inequality problems. SIAM Journal on Control and Optimization, 37(3):765–776, 1999.
- C. Song, Z. Zhou, Y. Zhou, Y. Jiang, and Y. Ma. Optimistic dual extrapolation for coherent non-monotone variational inequalities. In NeurIPS, pages 14303–14314, 2020.
- C. Song, Y. Jiang, and Y. Ma. Unified acceleration of high-order algorithms under general Hölder continuity. SIAM Journal on Optimization, 31(3):1797–1826, 2021.
- Some adaptive first-order methods for variational inequalities with relatively strongly monotone operators and generalized smoothness. In ICOPTA, pages 135–150. Springer, 2022.
- M. J. Todd. The Computation of Fixed Points and Applications. Springer, 2013.
- R. Trémolières, J.-L. Lions, and R. Glowinski. Numerical Analysis of Variational Inequalities. Elsevier, 2011.
- P. Tseng. A modified forward-backward splitting method for maximal monotone mappings. SIAM Journal on Control and Optimization, 38(2):431–446, 2000.
- A. Wibisono, A. C. Wilson, and M. I. Jordan. A variational perspective on accelerated methods in optimization. Proceedings of the National Academy of Sciences, 113(47):E7351–E7358, 2016.
- J. Zhang, M. Hong, and S. Zhang. On lower iteration complexity bounds for the convex concave saddle point problems. Mathematical Programming, 194(1):901–935, 2022.