
Local Search k-means++ with Foresight

Published 4 Jun 2024 in cs.DS | (2406.02739v1)

Abstract: Since its introduction in 1957, Lloyd's algorithm for $k$-means clustering has been extensively studied and has undergone several improvements. While in its original form it does not guarantee any approximation factor at all, Arthur and Vassilvitskii (SODA 2007) proposed $k$-means++, which enhances Lloyd's algorithm with a seeding method that guarantees an $\mathcal{O}(\log k)$-approximation in expectation. More recently, Lattanzi and Sohler (ICML 2019) proposed LS++, which further improves the solution quality of $k$-means++ by local search techniques to obtain an $\mathcal{O}(1)$-approximation. On the practical side, the greedy variant of $k$-means++ is often used, although its worst-case behaviour is provably worse than that of the standard $k$-means++ variant. We investigate how to improve LS++ further in practice. We study two options for improving the practical performance: (a) combining LS++ with greedy $k$-means++ instead of $k$-means++, and (b) improving LS++ by better entangling it with Lloyd's algorithm. Option (a) worsens the theoretical guarantees of $k$-means++ but improves the practical quality, including in combination with LS++, as we confirm in our experiments. Option (b) is our new algorithm, Foresight LS++ (FLS++). We experimentally show that FLS++ improves upon the solution quality of LS++ while retaining the asymptotic runtime and worst-case approximation bounds of LS++.
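To make the building blocks named in the abstract concrete, here is a minimal pure-Python sketch of $k$-means++ seeding ($D^2$ sampling) followed by LS++-style local search steps, as those two methods are commonly described. It is not the authors' implementation and does not implement FLS++ or the greedy seeding variant. All function names (`sq_dist`, `cost`, `d2_sample`, `kmeanspp_seed`, `local_search_step`, `ls_pp`) and the representation of points as tuples are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def d2_sample(points, centers, rng):
    """Sample a point with probability proportional to its squared
    distance to the nearest current center (D^2 sampling)."""
    weights = [min(sq_dist(p, c) for c in centers) for p in points]
    if sum(weights) == 0:
        # Every point coincides with a center; fall back to uniform.
        return rng.choice(points)
    return rng.choices(points, weights=weights, k=1)[0]


def kmeanspp_seed(points, k, rng):
    """k-means++ seeding: first center uniform, the rest by D^2 sampling."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(d2_sample(points, centers, rng))
    return centers


def local_search_step(points, centers, rng):
    """One LS++-style step: D^2-sample a candidate, then keep the single
    swap (candidate replaces one center) that lowers the cost the most.
    If no swap lowers the cost, the centers are returned unchanged."""
    candidate = d2_sample(points, centers, rng)
    best_cost, best_centers = cost(points, centers), centers
    for i in range(len(centers)):
        trial = centers[:i] + [candidate] + centers[i + 1:]
        trial_cost = cost(points, trial)
        if trial_cost < best_cost:
            best_cost, best_centers = trial_cost, trial
    return best_centers


def ls_pp(points, k, steps, seed=0):
    """Seeding followed by a fixed number of local search steps.
    (Hypothetical driver; Lloyd's iterations would normally follow.)"""
    rng = random.Random(seed)
    centers = kmeanspp_seed(points, k, rng)
    for _ in range(steps):
        centers = local_search_step(points, centers, rng)
    return centers
```

Because a swap is accepted only when it strictly lowers the cost, the cost never increases across local search steps in this sketch. Each step costs $\mathcal{O}(nk^2)$ distance evaluations here for simplicity; a careful implementation would cache nearest and second-nearest center distances to evaluate all swaps faster.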
