Papers
Topics
Authors
Recent
Search
2000 character limit reached

On Coresets for Clustering in Small Dimensional Euclidean Spaces

Published 27 Feb 2023 in cs.DS | (2302.13737v1)

Abstract: We consider the problem of constructing small coresets for $k$-Median in Euclidean spaces. Given a large set of data points $P\subset \mathbb{R}d$, a coreset is a much smaller set $S\subset \mathbb{R}d$, so that the $k$-Median costs of any $k$ centers w.r.t. $P$ and $S$ are close. Existing literature mainly focuses on the high-dimension case and there has been great success in obtaining dimension-independent bounds, whereas the case for small $d$ is largely unexplored. Considering many applications of Euclidean clustering algorithms are in small dimensions and the lack of systematic studies in the current literature, this paper investigates coresets for $k$-Median in small dimensions. For small $d$, a natural question is whether existing near-optimal dimension-independent bounds can be significantly improved. We provide affirmative answers to this question for a range of parameters. Moreover, new lower bound results are also proved, which are the highest for small $d$. In particular, we completely settle the coreset size bound for $1$-d $k$-Median (up to log factors). Interestingly, our results imply a strong separation between $1$-d $1$-Median and $1$-d $2$-Median. As far as we know, this is the first such separation between $k=1$ and $k=2$ in any dimension.

Citations (5)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.