Kernel Correlation-Dissimilarity for Multiple Kernel k-Means Clustering (2403.03448v1)
Abstract: The main objective of the Multiple Kernel k-Means (MKKM) algorithm is to extract non-linear information and achieve optimal clustering by optimizing the combination of base kernel matrices. Current methods enhance information diversity and reduce redundancy by exploiting interdependencies among multiple kernels based on either correlations or dissimilarities. Nevertheless, relying on a single metric, such as correlation or dissimilarity alone, to define kernel relationships introduces bias and yields an incomplete characterization. This limitation hinders efficient information extraction and ultimately compromises clustering performance. To tackle this challenge, we introduce a novel method that systematically integrates both kernel correlation and dissimilarity. Our approach comprehensively captures kernel relationships, enabling more effective extraction of clustering-relevant information and improving clustering performance. By emphasizing the coherence between kernel correlation and dissimilarity, our method offers a more objective and transparent strategy for extracting non-linear information and significantly improves clustering precision, supported by theoretical rationale. We assess the performance of our algorithm on 13 challenging benchmark datasets, demonstrating its superiority over contemporary state-of-the-art MKKM techniques.
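To make the abstract's pipeline concrete, the following is a minimal, illustrative sketch of MKKM with a redundancy term that combines pairwise kernel correlation and dissimilarity. The specific regularizer form, the weight-update heuristic, and all function names here are assumptions for illustration; they are not the paper's exact formulation. The sketch alternates between the standard spectral relaxation of kernel k-means (top-k eigenvectors of the combined kernel) and a closed-form weight update penalized by each kernel's redundancy score.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared distances -> RBF (Gaussian) kernel matrix
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_correlation(Ka, Kb):
    # Cosine similarity of kernel matrices under the Frobenius inner product
    return np.sum(Ka * Kb) / (np.linalg.norm(Ka) * np.linalg.norm(Kb))

def kernel_dissimilarity(Ka, Kb):
    # Normalized Frobenius distance; 0 when the two kernels coincide
    return np.linalg.norm(Ka - Kb) / (np.linalg.norm(Ka) + np.linalg.norm(Kb))

def mkkm(kernels, n_clusters, n_iter=20, lam=0.5, eps=1e-12):
    """Illustrative MKKM: alternate the spectral relaxation of kernel
    k-means with kernel-weight updates penalized by redundancy, where
    redundancy blends correlation and dissimilarity (an assumed form)."""
    m = len(kernels)
    w = np.full(m, 1.0 / m)
    # Redundancy score per kernel: high average correlation and low average
    # dissimilarity to the other kernels means the kernel is more redundant
    red = np.zeros(m)
    for p in range(m):
        for q in range(m):
            if p != q:
                red[p] += (kernel_correlation(kernels[p], kernels[q])
                           - kernel_dissimilarity(kernels[p], kernels[q]))
    red /= max(m - 1, 1)
    for _ in range(n_iter):
        # Combined kernel with squared weights, as in standard MKKM
        K = sum((wi**2) * Ki for wi, Ki in zip(w, kernels))
        # Spectral relaxation: H holds the top-k eigenvectors of K
        _, eigvecs = np.linalg.eigh(K)
        H = eigvecs[:, -n_clusters:]
        # Per-kernel clustering cost tr(K_p) - tr(H^T K_p H), shifted by the
        # redundancy penalty; weights then follow the usual closed form for
        # min_w sum_p w_p^2 c_p subject to sum_p w_p = 1, i.e. w_p ∝ 1/c_p
        cost = np.array([np.trace(Kp) - np.trace(H.T @ Kp @ H)
                         for Kp in kernels])
        cost = np.maximum(cost + lam * red, eps)
        inv = 1.0 / cost
        w = inv / inv.sum()
    return w, H
```

Final cluster labels would be obtained by running plain k-means on the rows of `H`; that step, and the paper's actual optimization of the joint correlation-dissimilarity objective, are omitted here.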