The Complexity of Dynamic Least-Squares Regression (2201.00228v2)
Abstract: We settle the complexity of dynamic least-squares regression (LSR), where rows and labels $(\mathbf{A}^{(t)}, \mathbf{b}^{(t)})$ can be adaptively inserted and/or deleted, and the goal is to efficiently maintain an $\epsilon$-approximate solution to $\min_{\mathbf{x}^{(t)}} \| \mathbf{A}^{(t)} \mathbf{x}^{(t)} - \mathbf{b}^{(t)} \|_2$ for all $t\in [T]$. We prove sharp separations ($d^{2-o(1)}$ vs. $\sim d$) between the amortized update time of: (i) fully vs. partially dynamic $0.01$-LSR; (ii) high- vs. low-accuracy LSR in the partially dynamic (insertion-only) setting. Our lower bounds follow from a gap-amplification reduction -- reminiscent of iterative refinement -- from the exact version of the Online Matrix Vector Conjecture (OMv) [HKNS15], to constant approximate OMv over the reals, where the $i$-th online product $\mathbf{H}\mathbf{v}^{(i)}$ only needs to be computed to $0.1$-relative error. All previous fine-grained reductions from OMv to its approximate versions only show hardness for inverse polynomial approximation $\epsilon = n^{-\omega(1)}$ (additive or multiplicative). This result is of independent interest in fine-grained complexity and for the investigation of the OMv Conjecture, which is still widely open.
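To make the dynamic setting concrete, the following is a minimal sketch of a classical exact baseline, not the algorithm or the reduction from this paper. It maintains the inverse Gram matrix under row insertions and deletions with Sherman-Morrison rank-one updates, so each update costs $O(d^2)$ arithmetic operations. The class name `DynamicLSR`, its methods, and the small `ridge` term (added only so the initial matrix is invertible, at the cost of a slight bias) are assumptions made for illustration.

```python
class DynamicLSR:
    """Illustrative exact baseline for fully dynamic least squares.

    Maintains P = (A^T A + ridge * I)^{-1} and c = A^T b, so the current
    solution is x = P c. The ridge term is an assumption of this sketch.
    """

    def __init__(self, d, ridge=1e-6):
        self.d = d
        # Before any row arrives, A^T A = 0, so P = (ridge * I)^{-1}.
        self.P = [[(1.0 / ridge if i == j else 0.0) for j in range(d)]
                  for i in range(d)]
        self.c = [0.0] * d

    def _rank_one(self, a, sign):
        # Sherman-Morrison with M = P^{-1} and s = sign:
        # (M + s a a^T)^{-1} = P - s (P a)(P a)^T / (1 + s a^T P a).
        d = self.d
        Pa = [sum(self.P[i][j] * a[j] for j in range(d)) for i in range(d)]
        denom = 1.0 + sign * sum(a[i] * Pa[i] for i in range(d))
        for i in range(d):
            for j in range(d):
                self.P[i][j] -= sign * Pa[i] * Pa[j] / denom

    def insert(self, a, b):
        """Add row a with label b in O(d^2) time."""
        self._rank_one(a, +1.0)
        for i in range(self.d):
            self.c[i] += b * a[i]

    def delete(self, a, b):
        """Remove a previously inserted row a with label b in O(d^2) time."""
        self._rank_one(a, -1.0)
        for i in range(self.d):
            self.c[i] -= b * a[i]

    def solve(self):
        """Return the current (ridge-perturbed) least-squares solution."""
        d = self.d
        return [sum(self.P[i][j] * self.c[j] for j in range(d))
                for i in range(d)]
```

As a usage example, inserting the rows $(1,0)$, $(0,1)$, $(1,1)$ with labels $1, 2, 3$ yields a solution close to $(1, 2)$; inserting and then deleting an additional row returns the maintained solution to approximately the same point. Deleting a row that was never inserted is not checked in this sketch.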