Algorithms for Discrepancy, Matchings, and Approximations: Fast, Simple, and Practical (2209.01147v1)
Abstract: We study one of the key tools in data approximation and optimization: low-discrepancy colorings. Formally, given a finite set system $(X,\mathcal S)$, the \emph{discrepancy} of a two-coloring $\chi:X\to\{-1,1\}$ is defined as $\max_{S \in \mathcal S}|\chi(S)|$, where $\chi(S)=\sum\limits_{x \in S}\chi(x)$. We propose a randomized algorithm which, for any $d>0$ and $(X,\mathcal S)$ with dual shatter function $\pi^*(k)=O(k^d)$, returns a coloring with expected discrepancy $O\left(\sqrt{|X|^{1-1/d}\log|\mathcal S|}\right)$ (this bound is tight) in time $\tilde O\left(|\mathcal S|\cdot|X|^{1/d}+|X|^{2+1/d}\right)$, improving upon the previous-best time of $O\left(|\mathcal S|\cdot|X|^3\right)$ by at least a factor of $|X|^{2-1/d}$ when $|\mathcal S|\geq|X|$. This setup includes many geometric classes, families of bounded dual VC-dimension, and others. As an immediate consequence, we obtain an improved algorithm to construct $\varepsilon$-approximations of sub-quadratic size. Our method uses primal-dual reweighing with an improved analysis of randomly updated weights and exploits the structural properties of the set system via matchings with low crossing number -- a fundamental structure in computational geometry. In particular, we get the same $|X|^{2-1/d}$ factor speed-up on the construction time of matchings with crossing number $O\left(|X|^{1-1/d}\right)$, which is the first improvement since the 1980s. The proposed algorithms are very simple, which makes it possible, for the first time, to compute colorings with near-optimal discrepancies and near-optimal sized approximations for abstract and geometric set systems in dimensions higher than $2$.
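
To make the definition of discrepancy concrete, the following is a minimal Python sketch that evaluates $\max_{S \in \mathcal S}|\chi(S)|$ for a given coloring. It is not the paper's algorithm: it only computes the quantity being minimized. The function name `discrepancy`, the dictionary representation of $\chi$, and the toy set system are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, Hashable, Iterable


def discrepancy(sets: Iterable[Iterable[Hashable]],
                chi: Dict[Hashable, int]) -> int:
    """Return max over S of |chi(S)|, where chi(S) is the sum of chi(x) for x in S.

    `sets` plays the role of the family S and `chi` maps each element of X
    to -1 or +1 (hypothetical representation chosen for this sketch).
    """
    return max((abs(sum(chi[x] for x in S)) for S in sets), default=0)


# Toy set system on X = {0, 1, 2, 3} (assumed example, not from the paper).
X = [0, 1, 2, 3]
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 1, 2, 3}]

# An alternating coloring balances every set in this family.
alternating = {0: 1, 1: -1, 2: 1, 3: -1}

# A monochromatic coloring is as unbalanced as possible on the full set.
monochromatic = {x: 1 for x in X}

print(discrepancy(sets, alternating))
print(discrepancy(sets, monochromatic))
```

On this toy family the alternating coloring has discrepancy 0 while the monochromatic one has discrepancy 4, which illustrates why the choice of coloring matters. Evaluating a fixed coloring this way costs time proportional to the total size of the sets; the difficulty addressed in the abstract is finding a good coloring, not scoring one.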