Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
133 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

On the Complexity Analysis of the Primal Solutions for the Accelerated Randomized Dual Coordinate Ascent (1807.00261v2)

Published 1 Jul 2018 in math.NA, cs.NA, and math.OC

Abstract: Dual first-order methods are essential techniques for large-scale constrained convex optimization. However, when recovering the primal solutions, we need $T(\epsilon{-2})$ iterations to achieve an $\epsilon$-optimal primal solution when we apply an algorithm to the non-strongly convex dual problem with $T(\epsilon{-1})$ iterations to achieve an $\epsilon$-optimal dual solution, where $T(x)$ can be $x$ or $\sqrt{x}$. In this paper, we prove that the iteration complexity of the primal solutions and dual solutions have the same $O\left(\frac{1}{\sqrt{\epsilon}}\right)$ order of magnitude for the accelerated randomized dual coordinate ascent. When the dual function further satisfies the quadratic functional growth condition, by restarting the algorithm at any period, we establish the linear iteration complexity for both the primal solutions and dual solutions even if the condition number is unknown. When applied to the regularized empirical risk minimization problem, we prove the iteration complexity of $O\left(n\log n+\sqrt{\frac{n}{\epsilon}}\right)$ in both primal space and dual space, where $n$ is the number of samples. Our result takes out the $\left(\log \frac{1}{\epsilon}\right)$ factor compared with the methods based on smoothing/regularization or Catalyst reduction. As far as we know, this is the first time that the optimal $O\left(\sqrt{\frac{n}{\epsilon}}\right)$ iteration complexity in the primal space is established for the dual coordinate ascent based stochastic algorithms. We also establish the accelerated linear complexity for some problems with nonsmooth loss, i.e., the least absolute deviation and SVM.

Citations (8)

Summary

We haven't generated a summary for this paper yet.