
Gradient methods for convex minimization: better rates under weaker conditions (1303.4645v2)

Published 19 Mar 2013 in math.OC, cs.IT, math.IT, and math.NA

Abstract: The convergence behavior of gradient methods for minimizing convex differentiable functions is one of the core questions in convex optimization. This paper shows that their well-known complexities can be achieved under conditions weaker than the commonly accepted ones. We relax the common gradient Lipschitz-continuity condition and strong convexity condition to ones that hold only over certain line segments. Specifically, we establish complexities $O(\frac{R}{\epsilon})$ and $O(\sqrt{\frac{R}{\epsilon}})$ for the ordinary and accelerated gradient methods, respectively, assuming that $\nabla f$ is Lipschitz continuous with constant $R$ over the line segment joining $x$ and $x-\frac{1}{R}\nabla f(x)$ for each $x\in\operatorname{dom} f$. Then we improve them to $O(\frac{R}{\nu}\log(\frac{1}{\epsilon}))$ and $O(\sqrt{\frac{R}{\nu}}\log(\frac{1}{\epsilon}))$ for a function $f$ that also satisfies the secant inequality $\langle \nabla f(x), x-x^*\rangle \ge \nu\|x-x^*\|^2$ for each $x\in\operatorname{dom} f$ and its projection $x^*$ onto the minimizer set of $f$. The secant condition is also shown to be necessary for the geometric decay of the solution error. Not only are the relaxed conditions met by more functions, but the restrictions also yield a smaller $R$ and a larger $\nu$ than are obtained without the restrictions, and thus lead to better complexity bounds. We apply these results to sparse optimization and demonstrate a faster algorithm.
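The complexity bounds above concern the ordinary gradient method with fixed step size $\frac{1}{R}$ and its accelerated (Nesterov-type) variant. The sketch below is not the paper's algorithm; it is a minimal illustration of both iterations on a toy quadratic, where the objective, the constant `R`, and the iteration counts are assumptions chosen only for demonstration.

```python
import numpy as np

# Toy smooth convex objective f(x) = 0.5 * x^T A x - b^T x (illustrative assumption).
A = np.diag([1.0, 10.0])
b = np.array([1.0, 2.0])
R = 10.0  # gradient Lipschitz constant of this quadratic (largest eigenvalue of A)

def grad(x):
    return A @ x - b

def gradient_method(x0, iters=200):
    # Ordinary gradient method with fixed step 1/R.
    x = x0.copy()
    for _ in range(iters):
        x = x - grad(x) / R
    return x

def accelerated_gradient_method(x0, iters=200):
    # Nesterov-type acceleration with the standard momentum sequence.
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = y - grad(y) / R
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x

x_star = np.linalg.solve(A, b)  # exact minimizer, for comparison
print(np.linalg.norm(gradient_method(np.zeros(2)) - x_star))
print(np.linalg.norm(accelerated_gradient_method(np.zeros(2)) - x_star))
```

On this strongly convex example both methods converge geometrically; the paper's point is that comparable rates already follow from the weaker segment-restricted Lipschitz and secant conditions.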

Citations (82)
