Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs (2306.09950v1)
Published 16 Jun 2023 in cs.DS and cs.LG
Abstract: This paper presents two efficient hierarchical clustering (HC) algorithms with respect to Dasgupta's cost function. For any input graph $G$ with a clear cluster structure, our algorithms run in time nearly linear in the input size of $G$ and return an $O(1)$-approximate HC tree with respect to Dasgupta's cost function. We compare the performance of our algorithm against the previous state of the art on synthetic and real-world datasets and show that it produces comparable or better HC trees with much lower running time.
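For readers unfamiliar with the objective, Dasgupta's cost of an HC tree $T$ for a weighted graph $G$ is $\mathrm{cost}_G(T) = \sum_{\{u,v\} \in E} w(u,v) \cdot |\mathrm{leaves}(T[u \vee v])|$, where $T[u \vee v]$ is the subtree rooted at the lowest common ancestor of $u$ and $v$. The sketch below only evaluates this cost for a given tree; it is not one of the paper's algorithms. The nested-tuple tree encoding and the names `leaves`, `dasgupta_cost`, `weights`, `good_tree` and `bad_tree` are assumptions made for illustration.

```python
def leaves(tree):
    """Return the set of vertices under a (sub)tree.

    Assumed encoding: a leaf is a vertex id (int); an internal node is a
    pair (left_subtree, right_subtree).
    """
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def dasgupta_cost(tree, weights):
    """Evaluate Dasgupta's cost of a binary HC tree.

    `weights` maps an edge (u, v) to its weight. An edge whose endpoints are
    separated at an internal node has that node as its lowest common
    ancestor, so it pays weight * (number of leaves under that node).
    """
    if not isinstance(tree, tuple):
        return 0
    left, right = leaves(tree[0]), leaves(tree[1])
    crossing = sum(
        w
        for (u, v), w in weights.items()
        if (u in left and v in right) or (u in right and v in left)
    )
    size = len(left) + len(right)
    return (
        size * crossing
        + dasgupta_cost(tree[0], weights)
        + dasgupta_cost(tree[1], weights)
    )


# Toy graph: two heavy edges (0-1 and 2-3) joined by one light edge (1-2).
weights = {(0, 1): 10, (2, 3): 10, (1, 2): 1}

good_tree = ((0, 1), (2, 3))  # cuts the light edge at the root
bad_tree = ((0, 2), (1, 3))   # cuts every edge at the root

print(dasgupta_cost(good_tree, weights))  # 4*1 + 2*10 + 2*10 = 44
print(dasgupta_cost(bad_tree, weights))   # 4*(10 + 10 + 1) = 84
```

The tree that separates the two dense pairs first pays the full leaf count only on the light edge, which is why a lower cost corresponds to a hierarchy that respects the cluster structure. This direct evaluation recomputes leaf sets at every node and is meant for small examples only.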