
Local correlation clustering (1312.5105v1)

Published 18 Dec 2013 in cs.DS

Abstract: Correlation clustering is perhaps the most natural formulation of clustering. Given $n$ objects and a pairwise similarity measure, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same cluster and dissimilar objects are put in different clusters. Despite its theoretical appeal, the practical relevance of correlation clustering still remains largely unexplored, mainly due to the fact that correlation clustering requires the $\Theta(n^2)$ pairwise similarities as input. In this paper we initiate the investigation into \emph{local} algorithms for correlation clustering. In \emph{local correlation clustering} we are given the identifier of a single object and we want to return the cluster to which it belongs in some globally consistent near-optimal clustering, using a small number of similarity queries. Local algorithms for correlation clustering open the door to \emph{sublinear-time} algorithms, which are particularly useful when the similarity between items is costly to compute, as it is often the case in many practical application domains. They also imply $(i)$ distributed and streaming clustering algorithms, $(ii)$ constant-time estimators and testers for cluster edit distance, and $(iii)$ property-preserving parallel reconstruction algorithms for clusterability. Specifically, we devise a local clustering algorithm attaining a $(3, \varepsilon)$-approximation in time $O(1/\varepsilon^2)$ independently of the dataset size. An explicit approximate clustering for all objects can be produced in time $O(n/\varepsilon)$ (which is provably optimal). We also provide a fully additive $(1,\varepsilon)$-approximation with local query complexity $poly(1/\varepsilon)$ and time complexity $2^{poly(1/\varepsilon)}$. The latter yields the fastest polynomial-time approximation scheme for correlation clustering known to date.
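
The objective described in the abstract can be illustrated with a small sketch. It counts the pairs a clustering gets wrong: similar objects placed in different clusters, or dissimilar objects placed in the same cluster. This is only an illustration of the objective, not the paper's local algorithm; the names `disagreements`, `toy_similar`, and `SIMILAR_PAIRS`, and the toy data, are assumptions made for this example. Scoring a clustering this way queries every pair, which is the $\Theta(n^2)$ input cost the abstract refers to.

```python
from itertools import combinations


def disagreements(labels, similar):
    """Count pairs on which a clustering disagrees with the similarity measure.

    labels  -- dict mapping each object to its cluster label
    similar -- function (u, v) -> bool, True when u and v are similar
    A pair is a disagreement when it is similar but split across clusters,
    or dissimilar but placed in the same cluster.
    """
    cost = 0
    # Every unordered pair is examined once: Theta(n^2) similarity queries.
    for u, v in combinations(sorted(labels), 2):
        same_cluster = labels[u] == labels[v]
        if same_cluster != similar(u, v):
            cost += 1
    return cost


# Hypothetical toy instance: objects 0, 1, 2 are mutually similar; 3 is not.
SIMILAR_PAIRS = {(0, 1), (0, 2), (1, 2)}


def toy_similar(u, v):
    return (min(u, v), max(u, v)) in SIMILAR_PAIRS


good = {0: "a", 1: "a", 2: "a", 3: "b"}  # matches the similarity structure
bad = {0: "a", 1: "a", 2: "b", 3: "b"}   # splits a similar group

print(disagreements(good, toy_similar))  # 0
print(disagreements(bad, toy_similar))   # 3
```

In the second clustering, the pairs (0, 2) and (1, 2) are similar but separated, and the pair (2, 3) is dissimilar but grouped together, which gives three disagreements.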

Authors (3)
  1. Francesco Bonchi (73 papers)
  2. Konstantin Kutzkov (12 papers)
  3. David García-Soriano (5 papers)
Citations (12)
