Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence (2410.02626v1)

Published 3 Oct 2024 in math.OC, cs.LG, and stat.ML

Abstract: In this paper, we propose a quasi-Newton method for solving smooth and monotone nonlinear equations, including unconstrained minimization and minimax optimization as special cases. For the strongly monotone setting, we establish two global convergence bounds: (i) a linear convergence rate that matches the rate of the celebrated extragradient method, and (ii) an explicit global superlinear convergence rate that provably surpasses the linear convergence rate after at most $\mathcal{O}(d)$ iterations, where $d$ is the problem's dimension. In addition, for the case where the operator is only monotone, we prove a global convergence rate of $\mathcal{O}(\min\{1/k, \sqrt{d}/k^{1.25}\})$ in terms of the duality gap. This matches the rate of the extragradient method when $k = \mathcal{O}(d^2)$ and is faster when $k = \Omega(d^2)$. These results are the first global convergence results to demonstrate a provable advantage of a quasi-Newton method over the extragradient method, without querying the Jacobian of the operator. Unlike classical quasi-Newton methods, we achieve this by using the hybrid proximal extragradient framework and a novel online learning approach for updating the Jacobian approximation matrices. Specifically, guided by the convergence analysis, we formulate the Jacobian approximation update as an online convex optimization problem over non-symmetric matrices, relating the regret of the online problem to the convergence rate of our method. To facilitate efficient implementation, we further develop a tailored online learning algorithm based on an approximate separation oracle, which preserves structures such as symmetry and sparsity in the Jacobian matrices.

Citations (1)

Summary

  • The paper introduces a quasi-Newton method that leverages online learning to update Jacobian approximations for solving smooth and monotone nonlinear equations.
  • For strongly monotone operators, it guarantees global linear convergence matching the extragradient rate, together with an explicit global superlinear rate that overtakes the linear rate after at most O(d) iterations.
  • The hybrid framework integrates online convex optimization to preserve sparsity and maintain computational efficiency in high-dimensional settings.

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence

The paper "Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence" introduces a novel quasi-Newton method designed to solve smooth and monotone nonlinear equations. This approach encompasses unconstrained minimization and minimax optimization as special cases. The focus is on addressing the limitations of classical quasi-Newton methods by leveraging an online learning approach to update the Jacobian approximation matrices.

The proposed algorithm stands out by offering global convergence guarantees in both strongly monotone and monotone settings. For strongly monotone operators, the algorithm achieves a linear convergence rate comparable to the extragradient method. More notably, it surpasses this performance with an explicit global superlinear convergence rate, achieved after at most $\mathcal{O}(d)$ iterations, where $d$ is the problem's dimension. This represents a significant theoretical advancement, demonstrating a quasi-Newton method's provable advantage over the extragradient method without querying the Jacobian.

For merely monotone operators, the paper establishes a global convergence rate of $\mathcal{O}(\min\{1/k, \sqrt{d}/k^{1.25}\})$ in terms of the duality gap, matching the extragradient method when $k = \mathcal{O}(d^2)$ and improving upon it when $k = \Omega(d^2)$. These results highlight the algorithm's efficacy in settings where classical methods may fall short.
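
To make the comparison concrete, the following Python snippet (an illustration of the stated bound, not code from the paper; the dimension and iteration counts are arbitrary choices) evaluates the two terms inside the minimum and shows where the $\sqrt{d}/k^{1.25}$ term overtakes the $1/k$ term:

```python
import numpy as np

# Illustrative check of the O(min{1/k, sqrt(d)/k^1.25}) duality-gap bound.
# The dimension d and the iteration counts below are arbitrary, not values from the paper.
d = 1_000
for k in (d**2 // 10, d**2, 10 * d**2):
    term_eg = 1.0 / k                 # extragradient-style 1/k term
    term_qn = np.sqrt(d) / k**1.25    # quasi-Newton sqrt(d)/k^1.25 term
    print(f"k = {k:>11,}:  1/k = {term_eg:.2e},  sqrt(d)/k^1.25 = {term_qn:.2e}")

# sqrt(d)/k^1.25 < 1/k  <=>  sqrt(d) < k^0.25  <=>  k > d^2,
# so the bound matches extragradient up to k on the order of d^2
# and becomes strictly smaller beyond that point.
```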

The research is built on the hybrid proximal extragradient framework and integrates an online convex optimization strategy for updating the Jacobian approximation matrices. Guided by the convergence analysis, the Jacobian update is formulated as an online convex optimization problem over non-symmetric matrices, so that the regret incurred by the online learner translates directly into the convergence rate of the overall method. To keep the updates computationally efficient, the authors develop a tailored online learning algorithm based on an approximate separation oracle, which preserves structure such as symmetry and sparsity in the Jacobian approximations where applicable.
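
To illustrate how an online learner can drive a quasi-Newton iteration, the sketch below shows a deliberately simplified version of this interplay. It is not the paper's algorithm: it replaces the hybrid proximal extragradient step and its acceptance condition with a plain regularized Newton-type step, and it replaces the separation-oracle-based learner with a single normalized online gradient step on the secant loss $\tfrac{1}{2}\|Bs - y\|^2$. All names and parameter values (`online_qn_solve`, `eta`, `lr`) are hypothetical.

```python
import numpy as np

def online_qn_solve(F, x0, eta=1.0, lr=0.5, iters=200):
    """Simplified sketch (not the paper's exact method): a regularized
    quasi-Newton iteration for F(x) = 0 whose Jacobian approximation B
    is refined by an online gradient step on the secant loss
    0.5 * ||B s - y||^2 with s = x_new - x and y = F(x_new) - F(x)."""
    d = x0.size
    x = x0.astype(float)
    B = np.eye(d)                                    # initial Jacobian approximation
    for _ in range(iters):
        # Regularized Newton-type step: solve (B + I/eta) dx = -F(x).
        dx = np.linalg.solve(B + np.eye(d) / eta, -F(x))
        x_new = x + dx
        # Observed secant pair, then an online update nudging B toward it.
        s, y = x_new - x, F(x_new) - F(x)
        grad_B = np.outer(B @ s - y, s)              # gradient of the secant loss in B
        B -= lr * grad_B / max(float(s @ s), 1e-12)  # normalized online gradient step
        x = x_new
    return x

# Toy usage: a strongly monotone affine operator F(x) = A x - b with A positive definite.
rng = np.random.default_rng(0)
d = 50
M = rng.standard_normal((d, d))
A = np.eye(d) + 0.1 * (M @ M.T) / d                  # well-conditioned SPD matrix
b = rng.standard_normal(d)
x_hat = online_qn_solve(lambda x: A @ x - b, np.zeros(d))
print(np.linalg.norm(A @ x_hat - b))                 # residual should be small
```

The point of the sketch is the division of labor: the iterate update only ever uses the current approximation $B$, while the online step moves $B$ toward the true Jacobian along the directions the iterates actually visit, which is the kind of quantity the paper's regret analysis controls.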

From a computational perspective, the algorithm demonstrates practical efficiency through bounded matrix-vector product complexity, offering clear advantages over traditional methods in data-intensive and high-dimensional settings. The method's ability to maintain favorable computational requirements while achieving superior convergence rates positions it as a valuable tool for complex optimization problems.

The implications of this research are significant. Practically, it opens pathways for efficiently solving large-scale optimization problems commonly encountered in machine learning and engineering. Theoretically, it enriches our understanding of quasi-Newton methods, offering new insights into their potential beyond classical constraints.

Future developments could extend this framework to broader classes of optimization problems and explore further enhancements to the online learning components for even greater efficiency and flexibility. As the complexity and scale of computational problems continue to grow, such advances are increasingly critical.
