
Near-optimal Sample Complexity Bounds for Robust Learning of Gaussian Mixtures via Compression Schemes (1710.05209v5)

Published 14 Oct 2017 in cs.LG, math.ST, and stat.TH

Abstract: We prove that $\tilde{\Theta}(k d^2 / \varepsilon^2)$ samples are necessary and sufficient for learning a mixture of $k$ Gaussians in $\mathbb{R}^d$, up to error $\varepsilon$ in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that $\tilde{O}(k d / \varepsilon^2)$ samples suffice, matching a known lower bound. Moreover, these results hold in the agnostic-learning/robust-estimation setting as well, where the target distribution is only approximately a mixture of Gaussians. The upper bound is shown using a novel technique for distribution learning based on a notion of 'compression.' Any class of distributions that allows such a compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in $\mathbb{R}^d$ admits a small-sized compression scheme.
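As a rough sketch of the compression notion referenced in the abstract (the parameter names $\tau$, $t$, $m$ and the exact form of the bound below are illustrative assumptions, not the paper's precise statements): a class $\mathcal{F}$ of distributions admits a $(\tau, t, m)$-compression scheme if there is a fixed decoder such that, for every $f \in \mathcal{F}$ and every $\varepsilon > 0$, after drawing $m(\varepsilon)$ i.i.d. samples from $f$, with high probability one can specify $\tau(\varepsilon)$ of those samples together with $t(\varepsilon)$ additional bits from which the decoder outputs some $\hat{f}$ with

$$ d_{\mathrm{TV}}(\hat{f}, f) \le \varepsilon. $$

The abstract's claim that compressible classes are learnable with few samples then corresponds to a sample complexity of roughly

$$ \tilde{O}\!\left( m(\varepsilon) + \frac{\tau(\varepsilon) + t(\varepsilon)}{\varepsilon^2} \right), $$

and the stated closure under products and mixtures allows a compression scheme for a single Gaussian in $\mathbb{R}^d$ to lift to mixtures of $k$ Gaussians, giving the $\tilde{O}(k d^2 / \varepsilon^2)$ upper bound.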

Citations (5)
