Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering (2211.01877v2)
Abstract: Convex clustering is a modern method with both hierarchical and $k$-means clustering characteristics. Although convex clustering can capture complex clustering structures hidden in data, the existing convex clustering algorithms are not scalable to large data sets with sample sizes greater than several thousand. Moreover, it is known that convex clustering sometimes fails to produce a complete hierarchical clustering structure. This issue arises if clusters split up or if the minimum number of possible clusters is larger than the desired number of clusters. In this paper, we propose convex clustering through majorization-minimization (CCMM), an iterative algorithm that uses cluster fusions and a highly efficient updating scheme derived using diagonal majorization. Additionally, we explore different strategies to ensure that the hierarchical clustering structure terminates in a single cluster. On a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space, achieving a solution time of 51 seconds on average.