Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
149 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

High Dimensional Clustering with $r$-nets (1811.02288v1)

Published 6 Nov 2018 in cs.CG

Abstract: Clustering, a fundamental task in data science and machine learning, groups a set of objects in such a way that objects in the same cluster are closer to each other than to those in other clusters. In this paper, we consider a well-known structure, so-called $r$-nets, which rigorously captures the properties of clustering. We devise algorithms that improve the run-time of approximating $r$-nets in high-dimensional spaces with $\ell_1$ and $\ell_2$ metrics from $\tilde{O}(dn{2-\Theta(\sqrt{\epsilon})})$ to $\tilde{O}(dn + n{2-\alpha})$, where $\alpha = \Omega({\epsilon{1/3}}/{\log(1/\epsilon)})$. These algorithms are also used to improve a framework that provides approximate solutions to other high dimensional distance problems. Using this framework, several important related problems can also be solved efficiently, e.g., $(1+\epsilon)$-approximate $k$th-nearest neighbor distance, $(4+\epsilon)$-approximate Min-Max clustering, $(4+\epsilon)$-approximate $k$-center clustering. In addition, we build an algorithm that $(1+\epsilon)$-approximates greedy permutations in time $\tilde{O}((dn + n{2-\alpha}) \cdot \log{\Phi})$ where $\Phi$ is the spread of the input. This algorithm is used to $(2+\epsilon)$-approximate $k$-center with the same time complexity.

Citations (1)

Summary

We haven't generated a summary for this paper yet.