The $\ell_p$-Subspace Sketch Problem in Small Dimensions with Applications to Support Vector Machines (2211.07132v2)
Abstract: In the $\ell_p$-subspace sketch problem, we are given an $n\times d$ matrix $A$ with $n>d$, and asked to build a small memory data structure $Q(A,\epsilon)$ so that, for any query vector $x\in\mathbb{R}^d$, we can output a number in $(1\pm\epsilon)\|Ax\|_p^p$ given only $Q(A,\epsilon)$. This problem is known to require $\tilde{\Omega}(d\epsilon^{-2})$ bits of memory for $d=\Omega(\log(1/\epsilon))$. However, for $d=o(\log(1/\epsilon))$, no data structure lower bounds were known. We resolve the memory required to solve the $\ell_p$-subspace sketch problem for any constant $d$ and integer $p$, showing that it is $\Omega(\epsilon^{-2(d-1)/(d+2p)})$ bits and $\tilde{O}(\epsilon^{-2(d-1)/(d+2p)})$ words. This shows that one can beat the $\Omega(\epsilon^{-2})$ lower bound, which holds for $d = \Omega(\log(1/\epsilon))$, for any constant $d$. We also show how to implement the upper bound in a single pass stream, with an additional multiplicative $\operatorname{poly}(\log \log n)$ factor and an additive $\operatorname{poly}(\log n)$ cost in the memory. Our bounds can be applied to point queries for SVMs with additive error, yielding an optimal bound of $\tilde{\Theta}(\epsilon^{-2d/(d+3)})$ for every constant $d$. This is a near-quadratic improvement over the $\Omega(\epsilon^{-(d+1)/(d+3)})$ lower bound of (Andoni et al. 2020). Our techniques rely on a novel connection to low dimensional techniques from geometric functional analysis.
- Yi Li
- Honghao Lin
- David P. Woodruff
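
To make the exponents quoted in the abstract concrete, the following is a minimal Python sketch that evaluates them for a few small values of $d$ and $p$. It only does arithmetic on the stated exponents: constants and logarithmic factors hidden by $\tilde{O}$, $\tilde{\Omega}$, and $\tilde{\Theta}$ are ignored, and the function names are hypothetical, chosen here for illustration and not taken from the paper.

```python
from fractions import Fraction


def subspace_sketch_exponent(d: int, p: int) -> Fraction:
    """Exponent e in the eps^{-e} bound for the l_p-subspace sketch: 2(d-1)/(d+2p)."""
    return Fraction(2 * (d - 1), d + 2 * p)


def svm_exponent(d: int) -> Fraction:
    """Exponent e in the eps^{-e} bound for SVM point queries: 2d/(d+3)."""
    return Fraction(2 * d, d + 3)


def prior_svm_lower_exponent(d: int) -> Fraction:
    """Exponent of the earlier lower bound cited in the abstract: (d+1)/(d+3)."""
    return Fraction(d + 1, d + 3)


if __name__ == "__main__":
    # Illustrative values only; the abstract states the bounds for constant d and integer p.
    for d in (2, 3, 5, 10):
        for p in (1, 2):
            e = subspace_sketch_exponent(d, p)
            # The exponent is always below 2, the exponent of the eps^{-2} bound.
            print(f"d={d:2d} p={p}: subspace sketch exponent {e} (< 2)")
        new, old = svm_exponent(d), prior_svm_lower_exponent(d)
        # The ratio new/old equals 2d/(d+1), which approaches 2 as d grows.
        print(f"d={d:2d}: SVM exponent {new}, prior lower bound {old}, ratio {new / old}")
```

The ratio of the two SVM exponents is $2d/(d+1)$, which tends to 2 as $d$ grows; this is the sense in which the abstract calls the improvement near-quadratic.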