Balancing Gradient and Hessian Queries in Non-Convex Optimization (2510.20786v1)
Abstract: We develop optimization methods that offer new trade-offs between the number of gradient and Hessian computations needed to compute a critical point of a non-convex function. We provide a method that, for any twice-differentiable $f\colon \mathbb R^d \rightarrow \mathbb R$ with $L_2$-Lipschitz Hessian, an initial point with $\Delta$-bounded sub-optimality, and sufficiently small $\epsilon > 0$, outputs an $\epsilon$-critical point, i.e., a point $x$ such that $\|\nabla f(x)\| \leq \epsilon$, using $\tilde{O}(L_2^{1/4} n_H^{-1/2}\Delta\epsilon^{-9/4})$ queries to a gradient oracle and $n_H$ queries to a Hessian oracle, for any positive integer $n_H$. As a consequence, we obtain an improved gradient query complexity of $\tilde{O}(d^{1/3}L_2^{1/2}\Delta\epsilon^{-3/2})$ in the case of bounded dimension and of $\tilde{O}(L_2^{3/4}\Delta^{3/2}\epsilon^{-9/4})$ in the case where we are allowed only a \emph{single} Hessian query. We obtain these results through a more general algorithm which can handle approximate Hessian computations and recovers the state-of-the-art bound of computing an $\epsilon$-critical point with $O(L_1^{1/2}L_2^{1/4}\Delta\epsilon^{-7/4})$ gradient queries, provided that $f$ also has an $L_1$-Lipschitz gradient.
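To make the abstract's terminology concrete, the sketch below illustrates the $\epsilon$-critical stopping rule ($\|\nabla f(x)\| \leq \epsilon$) and what "queries to a gradient oracle" means, using plain gradient descent on a one-dimensional non-convex function. This is only a minimal illustration of the definitions; it is not the paper's algorithm, and the function, step size, and names are chosen for this example.

```python
import math

def eps_critical_point(grad, x0, eps, step=1e-2, max_queries=100_000):
    """Run plain gradient descent until the epsilon-critical
    criterion |grad f(x)| <= eps holds, counting gradient-oracle
    queries. Illustrative only -- NOT the paper's method, which
    also uses Hessian queries to reduce gradient-query complexity."""
    x = float(x0)
    for n_queries in range(1, max_queries + 1):
        g = grad(x)            # one gradient-oracle query
        if abs(g) <= eps:      # epsilon-critical point reached
            return x, n_queries
        x -= step * g
    return x, max_queries

# Non-convex test function f(x) = x^4 - x^2:
# minima at x = +/- 1/sqrt(2), a saddle (local max) at x = 0.
grad_f = lambda x: 4 * x**3 - 2 * x
x, n = eps_critical_point(grad_f, x0=1.5, eps=1e-6)
```

Starting from `x0 = 1.5`, the iterates converge to the nearby minimum at $1/\sqrt{2}$, where the gradient norm falls below $\epsilon$; the returned query count is the quantity the paper's trade-offs bound.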