Leverage Score Sampling for Faster Accelerated Regression and ERM (1711.08426v1)

Published 22 Nov 2017 in stat.ML, cs.LG, and math.OC

Abstract: Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $\min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2}$ in time $\tilde{O}((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1})$, where $\kappa_{\text{sum}}=\mathrm{tr}\left(\mathbf{A}^{\top}\mathbf{A}\right)/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$. Our algorithm improves upon the previous best running time of $\tilde{O}((n+\sqrt{n \cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1})$. We achieve our result through a careful combination of leverage score sampling techniques, proximal point methods, and accelerated coordinate descent. Our method not only matches the performance of previous methods, but further improves whenever leverage scores of rows are small (up to polylogarithmic factors). We also provide a non-linear generalization of these results that improves the running time for solving a broader class of ERM problems.
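
To make the gap between the two running-time bounds in the abstract concrete, the following is a minimal sketch that evaluates both expressions with the polylogarithmic factors and constants hidden by $\tilde{O}$ dropped. The function names and the problem sizes are hypothetical, chosen only for illustration; they do not come from the paper, and the numbers are not measured running times.

```python
import math


def previous_bound(n, kappa_sum, s, eps):
    """(n + sqrt(n * kappa_sum)) * s * log(1/eps), ignoring polylog factors."""
    return (n + math.sqrt(n * kappa_sum)) * s * math.log(1.0 / eps)


def new_bound(n, d, kappa_sum, s, eps):
    """(n + sqrt(d * kappa_sum)) * s * log(1/eps), ignoring polylog factors."""
    return (n + math.sqrt(d * kappa_sum)) * s * math.log(1.0 / eps)


def bound_ratio(n, d, kappa_sum, s, eps):
    """Ratio of the previous bound to the new bound (larger means bigger gain)."""
    return previous_bound(n, kappa_sum, s, eps) / new_bound(n, d, kappa_sum, s, eps)


if __name__ == "__main__":
    # Hypothetical tall, ill-conditioned instance (assumed values, not from the paper).
    n, d, kappa_sum, s, eps = 1_000_000, 100, 1e8, 10, 1e-6
    ratio = bound_ratio(n, d, kappa_sum, s, eps)
    print(f"previous / new bound ratio: {ratio:.2f}")
```

The factors $s$ and $\log\epsilon^{-1}$ cancel in the ratio, so the gain depends only on how $\sqrt{n\cdot\kappa_{\text{sum}}}$ compares with $\sqrt{d\cdot\kappa_{\text{sum}}}$ relative to $n$. When $d = n$ the two expressions coincide, and the gain grows as the matrix becomes taller ($n \gg d$) and $\kappa_{\text{sum}}$ becomes large.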

Citations (21)
