Simple and Optimal Sublinear Algorithms for Mean Estimation

Published 7 Jun 2024 in cs.DS | (2406.05254v3)

Abstract: We study the sublinear multivariate mean estimation problem in $d$-dimensional Euclidean space. Specifically, we aim to find the mean $\mu$ of a ground point set $A$, which minimizes the sum of squared Euclidean distances of the points in $A$ to $\mu$. We first show that a multiplicative $(1+\varepsilon)$ approximation to $\mu$ can be found with probability $1-\delta$ using $O(\varepsilon^{-1}\log \delta^{-1})$ many independent uniform random samples, and provide a matching lower bound. Furthermore, we give two sublinear time algorithms with optimal sample complexity for extracting a suitable approximate mean: 1. A gradient descent approach running in time $O((\varepsilon^{-1}+\log\log \delta^{-1})\cdot \log \delta^{-1} \cdot d)$. It optimizes the geometric median objective while being significantly faster for our specific setting than all other known algorithms for this problem. 2. An order statistics and clustering approach running in time $O\left((\varepsilon^{-1}+\log^{\gamma}\delta^{-1})\cdot \log \delta^{-1} \cdot d\right)$ for any constant $\gamma>0$. Throughout our analysis, we also generalize the familiar median-of-means estimator to the multivariate case, showing that the geometric median-of-means estimator achieves an optimal sample complexity for estimating $\mu$, which may be of independent interest.
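
To make the geometric median-of-means idea concrete, here is a minimal Python sketch, not the paper's algorithm. It assumes the standard median-of-means layout: about $\log \delta^{-1}$ batches of about $\varepsilon^{-1}$ uniform samples each, with all hidden constants set to 1, which is purely illustrative. The geometric median of the batch means is computed with a plain Weiszfeld iteration, swapped in for simplicity in place of the gradient descent or order-statistics procedures the abstract describes. The function names (`mean`, `cost`, `geometric_median`, `median_of_means`) are hypothetical.

```python
import math
import random


def mean(points):
    """Coordinate-wise mean of a non-empty list of d-dimensional points."""
    n, d = len(points), len(points[0])
    return [sum(p[j] for p in points) / n for j in range(d)]


def cost(points, c):
    """Sum of squared Euclidean distances from the points to the center c."""
    return sum(math.dist(p, c) ** 2 for p in points)


def geometric_median(points, iters=200, tol=1e-9):
    """Approximate geometric median via Weiszfeld's iteration.

    Simplification: a point that coincides with the current iterate is
    skipped, which avoids a division by zero.
    """
    y = mean(points)
    for _ in range(iters):
        num = [0.0] * len(y)
        denom = 0.0
        for p in points:
            dist = math.dist(p, y)
            if dist < 1e-12:
                continue
            w = 1.0 / dist
            denom += w
            for j, pj in enumerate(p):
                num[j] += w * pj
        if denom == 0.0:  # every point coincides with the iterate
            break
        new_y = [v / denom for v in num]
        converged = math.dist(new_y, y) < tol
        y = new_y
        if converged:
            break
    return y


def median_of_means(A, eps, delta, rng=random):
    """Geometric median of batch means over uniform samples from A.

    Assumed constants: ceil(ln(1/delta)) batches of ceil(1/eps) samples.
    """
    k = max(1, math.ceil(math.log(1.0 / delta)))  # number of batches
    b = max(1, math.ceil(1.0 / eps))              # samples per batch
    batch_means = []
    for _ in range(k):
        batch = [A[rng.randrange(len(A))] for _ in range(b)]
        batch_means.append(mean(batch))
    return geometric_median(batch_means)
```

One way to try it: draw `est = median_of_means(A, 0.1, 0.01)` and compare `cost(A, est)` against `cost(A, mean(A))`. Because $\sum_{a \in A} \|a - c\|^2 = \sum_{a \in A} \|a - \mu\|^2 + |A| \cdot \|c - \mu\|^2$ for every center $c$, a $(1+\varepsilon)$ approximation in cost is the same as $\|c - \mu\|^2 \le \varepsilon \cdot \mathrm{cost}(A, \mu) / |A|$. With the constants assumed here, the sketch carries no guarantee of meeting that bound.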
