Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
144 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances (2312.02417v1)

Published 5 Dec 2023 in math.ST, cs.DS, stat.ML, and stat.TH

Abstract: Given data drawn from a collection of Gaussian variables with a common mean but different and unknown variances, what is the best algorithm for estimating their common mean? We present an intuitive and efficient algorithm for this task. As different closed-form guarantees can be hard to compare, the Subset-of-Signals model serves as a benchmark for heteroskedastic mean estimation: given $n$ Gaussian variables with an unknown subset of $m$ variables having variance bounded by 1, what is the optimal estimation error as a function of $n$ and $m$? Our algorithm resolves this open question up to logarithmic factors, improving upon the previous best known estimation error by polynomial factors when $m = nc$ for all $0<c<1$. Of particular note, we obtain error $o(1)$ with $m = \tilde{O}(n{1/4})$ variance-bounded samples, whereas previous work required $m = \tilde{\Omega}(n{1/2})$. Finally, we show that in the multi-dimensional setting, even for $d=2$, our techniques enable rates comparable to knowing the variance of each sample.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com