
Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs (2306.09950v1)

Published 16 Jun 2023 in cs.DS and cs.LG

Abstract: This paper presents two efficient hierarchical clustering (HC) algorithms with respect to Dasgupta's cost function. For any input graph $G$ with a clear cluster-structure, our designed algorithms run in nearly-linear time in the input size of $G$, and return an $O(1)$-approximate HC tree with respect to Dasgupta's cost function. We compare the performance of our algorithm against the previous state-of-the-art on synthetic and real-world datasets and show that our designed algorithm produces comparable or better HC trees with much lower running time.
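
For readers unfamiliar with the objective named in the abstract: Dasgupta's cost of an HC tree $T$ on a weighted graph $G$ is $\sum_{\{u,v\} \in E} w(u,v) \cdot |\mathrm{leaves}(T[u \vee v])|$, where $T[u \vee v]$ is the subtree rooted at the lowest common ancestor of $u$ and $v$. Lower cost means heavy edges are separated deep in the tree. The sketch below only evaluates this cost on a toy graph; it is not the paper's algorithm, and the nested-tuple tree encoding and the names `leaves` and `dasgupta_cost` are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple, Union

# Hypothetical encoding: a leaf is a vertex id, an internal node is a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> FrozenSet[int]:
    """Return the set of vertices at the leaves of `tree`."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) | leaves(tree[1])
    return frozenset([tree])


def dasgupta_cost(tree: Tree, weights: Dict[Tuple[int, int], float]) -> float:
    """Sum over edges {u, v} of w(u, v) times the leaf count under their LCA."""
    if not isinstance(tree, tuple):
        return 0.0  # a single leaf separates no edge
    left, right = tree
    left_leaves, right_leaves = leaves(left), leaves(right)
    size = len(left_leaves) + len(right_leaves)
    # Edges split at this node have their lowest common ancestor here.
    crossing = sum(
        w
        for (u, v), w in weights.items()
        if (u in left_leaves and v in right_leaves)
        or (u in right_leaves and v in left_leaves)
    )
    return size * crossing + dasgupta_cost(left, weights) + dasgupta_cost(right, weights)


# Toy input: the unweighted path 0 - 1 - 2 - 3.
path = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
balanced = ((0, 1), (2, 3))      # cuts only the middle edge at the root
interleaved = ((0, 2), (1, 3))   # cuts every edge at the root

print(dasgupta_cost(balanced, path))     # 8.0
print(dasgupta_cost(interleaved, path))  # 12.0
```

The balanced tree pays the full leaf count of 4 for one edge only, while the interleaved tree pays it for all three, which is why its cost is higher.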
