Sharp kernel clustering algorithms and their associated Grothendieck inequalities (0906.4816v1)

Published 25 Jun 2009 in cs.DS and cs.CC

Abstract: In the kernel clustering problem we are given a (large) $n\times n$ symmetric positive semidefinite matrix $A=(a_{ij})$ with $\sum_{i=1}^n\sum_{j=1}^n a_{ij}=0$ and a (small) $k\times k$ symmetric positive semidefinite matrix $B=(b_{ij})$. The goal is to find a partition $\{S_1,\ldots,S_k\}$ of $\{1,\ldots,n\}$ which maximizes $\sum_{i=1}^k\sum_{j=1}^k \left(\sum_{(p,q)\in S_i\times S_j}a_{pq}\right)b_{ij}$. We design a polynomial time approximation algorithm that achieves an approximation ratio of $\frac{R(B)^2}{C(B)}$, where $R(B)$ and $C(B)$ are geometric parameters that depend only on the matrix $B$, defined as follows: if $b_{ij} = \langle v_i, v_j\rangle$ is the Gram matrix representation of $B$ for some $v_1,\ldots,v_k\in \mathbb{R}^k$ then $R(B)$ is the minimum radius of a Euclidean ball containing the points $\{v_1,\ldots,v_k\}$. The parameter $C(B)$ is defined as the maximum over all measurable partitions $\{A_1,\ldots,A_k\}$ of $\mathbb{R}^{k-1}$ of the quantity $\sum_{i=1}^k\sum_{j=1}^k b_{ij}\langle z_i,z_j\rangle$, where for $i\in \{1,\ldots,k\}$ the vector $z_i\in \mathbb{R}^{k-1}$ is the Gaussian moment of $A_i$, i.e., $z_i=\frac{1}{(2\pi)^{(k-1)/2}}\int_{A_i}x e^{-\|x\|_2^2/2}\,dx$. We also show that for every $\varepsilon > 0$, achieving an approximation guarantee of $(1-\varepsilon)\frac{R(B)^2}{C(B)}$ is Unique Games hard.
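
To make the objective in the abstract concrete, the following minimal sketch evaluates $\sum_{i,j}\left(\sum_{(p,q)\in S_i\times S_j}a_{pq}\right)b_{ij}$ for a given partition. It only scores a candidate partition and is not the paper's approximation algorithm. The function name `kernel_clustering_objective`, the `labels` encoding of the partition, and the toy matrices are assumptions made for illustration.

```python
def kernel_clustering_objective(A, B, labels):
    """Evaluate sum_{i,j} (sum_{p in S_i, q in S_j} a_pq) * b_ij.

    A: n x n matrix as a list of lists.
    B: k x k matrix as a list of lists.
    labels[p]: index i of the cluster S_i that contains point p
               (a hypothetical encoding of the partition).
    """
    n = len(A)
    k = len(B)
    # mass[i][j] accumulates the total of a_pq over S_i x S_j
    mass = [[0.0] * k for _ in range(k)]
    for p in range(n):
        for q in range(n):
            mass[labels[p]][labels[q]] += A[p][q]
    return sum(mass[i][j] * B[i][j] for i in range(k) for j in range(k))


# Toy instance: A = u u^T with u = (1, 1, -1, -1), so A is symmetric
# positive semidefinite and its entries sum to zero, as the problem requires.
A = [
    [1.0, 1.0, -1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
]
B = [[1.0, 0.0], [0.0, 1.0]]  # 2 x 2 identity, a simple PSD choice

good = kernel_clustering_objective(A, B, [0, 0, 1, 1])
bad = kernel_clustering_objective(A, B, [0, 1, 0, 1])
```

In this toy instance, grouping the two positively correlated pairs together scores higher than splitting them across clusters.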

Citations (25)
