Sharp kernel clustering algorithms and their associated Grothendieck inequalities (0906.4816v1)

Published 25 Jun 2009 in cs.DS and cs.CC

Abstract: In the kernel clustering problem we are given a (large) $n\times n$ symmetric positive semidefinite matrix $A=(a_{ij})$ with $\sum_{i=1}^{n\sum_{j=1}ⁿ} a_{ij}=0$ and a (small) $k\times k$ symmetric positive semidefinite matrix $B=(b_{ij})$. The goal is to find a partition ${S_1,...,S_k}$ of ${1,... n}$ which maximizes $ \sum_{i=1}^{k\sum_{j=1}^k} (\sum_{(p,q)\in S_i\times S_j}a_{pq})b_{ij}$. We design a polynomial time approximation algorithm that achieves an approximation ratio of $\frac{R(B)^2}{C(B)}$, where $R(B)$ and $C(B)$ are geometric parameters that depend only on the matrix $B$, defined as follows: if $b_{ij} = < v_i, v_j>$ is the Gram matrix representation of $B$ for some $v_1,...,v_k\in \R^k$ then $R(B)$ is the minimum radius of a Euclidean ball containing the points ${v_1, ..., v_k}$. The parameter $C(B)$ is defined as the maximum over all measurable partitions ${A_1,...,A_k}$ of $\R^{k-1}$ of the quantity $\sum_{i=1}^{k\sum_{j=1}^k} b_{ij}< z_i,z_j>$, where for $i\in {1,...,k}$ the vector $z_i\in \R^{k-1}$ is the Gaussian moment of $A_i$, i.e., $z_i=\frac{1}{(2\pi)^{{(k-1)/2}}\int_{A_i}xe^{{-|x|_2^2/2}dx$.}} We also show that for every $\eps > 0$, achieving an approximation guarantee of $(1-\e)\frac{R(B)^2}{C(B)}$ is Unique Games hard.

Citations (25)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Sharp kernel clustering algorithms and their associated Grothendieck inequalities (0906.4816v1)

Summary

Related Papers