
Self Supervised Correlation-based Permutations for Multi-View Clustering (2402.16383v1)

Published 26 Feb 2024 in cs.LG and stat.ML

Abstract: Fusing information from different modalities can enhance data analysis tasks, including clustering. However, existing multi-view clustering (MVC) solutions are limited to specific domains or rely on a suboptimal and computationally demanding two-stage procedure of representation and clustering. We propose an end-to-end deep learning-based MVC framework for general data (image, tabular, etc.). Our approach involves learning meaningful fused data representations with a novel permutation-based canonical correlation objective. Concurrently, we learn cluster assignments by identifying consistent pseudo-labels across multiple views. We demonstrate the effectiveness of our model using ten MVC benchmark datasets. Theoretically, we show that our model approximates the supervised linear discriminant analysis (LDA) representation. Additionally, we provide an error bound induced by false pseudo-label annotations.
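
The permutation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes two one-dimensional views (where the canonical correlation reduces to the absolute Pearson correlation), uses ground-truth cluster labels as a stand-in for pseudo-labels, and the helper names `pearson`, `permute_within_clusters`, and `demo` are hypothetical. It shows that re-pairing samples across views only within the same cluster leaves the cross-view correlation largely intact when the shared signal is the cluster structure.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def permute_within_clusters(labels, rng):
    """Return an index permutation that shuffles samples only among
    peers that share the same (pseudo-)label."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(c, []).append(i)
    perm = list(range(len(labels)))
    for idx in groups.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for src, dst in zip(idx, shuffled):
            perm[src] = dst
    return perm


def demo(seed=0, n_per_cluster=100):
    """Toy data: both views observe the same cluster centre plus
    independent noise. Labels here are the true clusters, standing in
    for pseudo-labels (an assumption of this sketch)."""
    rng = random.Random(seed)
    centers = [-3.0, 0.0, 3.0]
    labels, view1, view2 = [], [], []
    for c, mu in enumerate(centers):
        for _ in range(n_per_cluster):
            labels.append(c)
            view1.append(mu + rng.gauss(0, 0.5))
            view2.append(mu + rng.gauss(0, 0.5))
    perm = permute_within_clusters(labels, rng)
    view2_permuted = [view2[j] for j in perm]
    corr_paired = abs(pearson(view1, view2))
    corr_permuted = abs(pearson(view1, view2_permuted))
    return labels, perm, corr_paired, corr_permuted


if __name__ == "__main__":
    _, _, paired, permuted = demo()
    print(f"paired: {paired:.3f}  within-cluster permuted: {permuted:.3f}")
```

In this toy setting both correlations stay high because the permutation never mixes clusters. With multi-dimensional views, the absolute Pearson correlation would be replaced by a full canonical correlation computation.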
