First-order algorithms converge faster than $O(1/k)$ on convex problems
Published 20 Dec 2018 in math.OC (arXiv:1812.08485v4)
Abstract: It is well known that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value, when applied to a scheme for minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems to show similar $o(1/k)$ convergence rates. The result is tight in the sense that a rate of $O(1/k^{1+\epsilon})$ is not generally attainable for any $\epsilon>0$, for any of these methods.
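A minimal numerical sketch of the quantity the abstract discusses, assuming a toy least-squares objective (this problem instance and all names below are illustrative choices, not the paper's construction): gradient descent with step size $1/L$ is run on a smooth convex function, and $k \cdot (f(x_k) - f^*)$ is tracked. The classical $O(1/k)$ bound only guarantees this product stays bounded; the paper's $o(1/k)$ result says it should tend to zero.

```python
import numpy as np

# Toy illustration (not the paper's worst-case construction): gradient descent
# with step size 1/L on a smooth convex least-squares objective.
# Classical analysis: f(x_k) - f* <= L * ||x_0 - x*||^2 / (2k), i.e. O(1/k).
# The paper's result implies k * (f(x_k) - f*) -> 0, i.e. o(1/k).

rng = np.random.default_rng(0)
n, d = 50, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def f(x):
    r = A @ x - b
    return 0.5 * r @ r

def grad_f(x):
    return A.T @ (A @ x - b)

L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of grad f
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
f_star = f(x_star)

x = np.zeros(d)
for k in range(1, 2001):
    x = x - grad_f(x) / L                  # gradient step with step size 1/L
    if k % 400 == 0:
        gap = f(x) - f_star
        print(f"k={k:5d}  gap={gap:.3e}  k*gap={k * gap:.3e}")
```

On this well-conditioned quadratic the decay of $k \cdot (f(x_k) - f^*)$ is much faster than the general theory requires; the tightness claim in the abstract concerns worst-case instances, where no rate of $O(1/k^{1+\epsilon})$ can be guaranteed.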