Nearly Optimal Clustering Risk Bounds for Kernel K-Means (2003.03888v2)
Published 9 Mar 2020 in cs.LG and stat.ML
Abstract: In this paper, we study the statistical properties of kernel $k$-means and obtain a nearly optimal excess clustering risk bound, substantially improving the state-of-the-art bounds in the existing clustering risk analyses. We further analyze the statistical effect of computational approximations of the Nyström kernel $k$-means, and prove that it achieves the same statistical accuracy as the exact kernel $k$-means using only $\Omega(\sqrt{nk})$ Nyström landmark points. To the best of our knowledge, such sharp excess clustering risk bounds for kernel (or approximate kernel) $k$-means have never been proposed before.
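
To give a feel for the $\Omega(\sqrt{nk})$ landmark requirement, the following minimal Python sketch computes a landmark count of order $\sqrt{nk}$ and compares the number of kernel evaluations needed for the full $n \times n$ kernel matrix against the $n \times m$ block used by a Nyström approximation. The function names and the constant `c` are illustrative assumptions, not taken from the paper; the abstract fixes only the order of growth, not the constant.

```python
import math


def nystrom_landmark_count(n, k, c=1.0):
    """Landmark count of order sqrt(n * k), capped at n.

    `c` is a hypothetical constant: the bound only specifies the
    growth rate in n and k, not the constant in front of it.
    """
    return min(n, math.ceil(c * math.sqrt(n * k)))


def kernel_evaluations(n, m):
    """Kernel evaluations for the exact method versus a Nystrom sketch.

    Exact kernel k-means needs the full n x n kernel matrix, while a
    Nystrom approximation only needs the n x m block against m landmarks.
    """
    return {"exact": n * n, "nystrom": n * m}


if __name__ == "__main__":
    n, k = 10_000, 10
    m = nystrom_landmark_count(n, k)
    costs = kernel_evaluations(n, m)
    print(f"n={n}, k={k}, landmarks m={m}")
    print(f"exact kernel entries:   {costs['exact']}")
    print(f"nystrom kernel entries: {costs['nystrom']}")
```

With `c = 1`, the kernel block shrinks from order $n^2$ entries to order $n\sqrt{nk}$ entries, which is where the computational saving of the Nyström variant comes from.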