Discrepancy Minimization in Input-Sparsity Time (2210.12468v1)
Abstract: A recent work of Larsen [Lar23] gave a faster combinatorial alternative to Bansal's SDP algorithm for finding a coloring $x\in{-1,1}n$ that approximately minimizes the discrepancy $\mathrm{disc}(A,x) : = | A x |_{\infty}$ of a general real-valued $m\times n$ matrix $A$. Larsen's algorithm runs in $\widetilde{O}(mn2)$ time compared to Bansal's $\widetilde{O}(mn{4.5})$-time algorithm, at the price of a slightly weaker logarithmic approximation ratio in terms of the hereditary discrepancy of $A$ [Ban10]. In this work we present a combinatorial $\widetilde{O}(\mathrm{nnz}(A) + n3)$ time algorithm with the same approximation guarantee as Larsen, which is optimal for tall matrices $m=\mathrm{poly}(n)$. Using a more intricate analysis and fast matrix-multiplication, we achieve $\widetilde{O}(\mathrm{nnz}(A) + n{2.53})$ time, which breaks cubic runtime for square matrices, and bypasses the barrier of linear-programming approaches [ES14] for which input-sparsity time is currently out of reach. Our algorithm relies on two main ideas: (i) A new sketching technique for finding a projection matrix with short $\ell_2$-basis using implicit leverage-score sampling; (ii) A data structure for faster implementation of the iterative Edge-Walk partial-coloring algorithm of Lovett-Meka, using an alternative analysis that enables ``lazy" batch-updates with low-rank corrections. Our result nearly closes the computational gap between real-valued and binary matrices (set-systems), for which input-sparsity time coloring was very recently obtained [JSS23].