
SP$^2$OT: Semantic-Regularized Progressive Partial Optimal Transport for Imbalanced Clustering (2404.03446v1)

Published 4 Apr 2024 in cs.CV and cs.LG

Abstract: Deep clustering, which learns representations and semantic clusters without label information, poses a great challenge for deep learning-based approaches. Despite significant progress in recent years, most existing methods focus on uniformly distributed datasets, significantly limiting their practical applicability. In this paper, we propose a more practical problem setting named deep imbalanced clustering, where the underlying classes exhibit an imbalanced distribution. To address this challenge, we introduce a novel optimal transport-based pseudo-label learning framework. Our framework formulates pseudo-label generation as a Semantic-regularized Progressive Partial Optimal Transport (SP$^2$OT) problem, which progressively transports each sample to imbalanced clusters under prior-distribution and semantic-relation constraints, thus generating high-quality and imbalance-aware pseudo-labels. To solve SP$^2$OT, we develop a Majorization-Minimization-based optimization algorithm. Specifically, we employ a majorization strategy to reformulate the SP$^2$OT problem as a Progressive Partial Optimal Transport problem, which can be transformed into an unbalanced optimal transport problem with augmented constraints and solved efficiently by a fast matrix scaling algorithm. Experiments on various datasets, including a human-curated long-tailed CIFAR100, the challenging ImageNet-R, and large-scale subsets of the fine-grained iNaturalist2018 dataset, demonstrate the superiority of our method.
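The solver described above ultimately reduces SP$^2$OT to an optimal transport problem handled by a fast matrix scaling algorithm. As an illustrative sketch only (not the authors' implementation), the snippet below shows the classic Sinkhorn-style matrix scaling loop for an entropic OT problem with an imbalance-aware cluster marginal, which is the basic mechanism such OT pseudo-labeling builds on; the function name, `eps`, and the toy data are assumptions for illustration, and the full SP$^2$OT objective with semantic regularization and progressive mass constraints is more involved.

```python
import numpy as np

def sinkhorn_pseudo_labels(cost, row_mass, col_mass, eps=0.1, n_iters=500):
    """Entropic OT via Sinkhorn matrix scaling (illustrative sketch).

    cost:     (n, k) sample-to-cluster cost matrix (e.g. negative logits)
    row_mass: (n,) per-sample mass (uniform for full transport)
    col_mass: (k,) cluster marginal, encoding an imbalance-aware prior
    Returns the transport plan Q = diag(u) K diag(v); row-normalizing Q
    yields soft pseudo-labels.
    """
    K = np.exp(-cost / eps)                 # Gibbs kernel
    u = np.ones_like(row_mass)
    for _ in range(n_iters):
        v = col_mass / (K.T @ u)            # scale columns toward col_mass
        u = row_mass / (K @ v)              # scale rows toward row_mass
    return u[:, None] * K * v[None, :]

# Toy usage: 6 samples, 3 imbalanced clusters with a long-tailed prior.
rng = np.random.default_rng(0)
cost = rng.random((6, 3))
row_mass = np.full(6, 1 / 6)
col_mass = np.array([0.6, 0.3, 0.1])        # hypothetical cluster prior
Q = sinkhorn_pseudo_labels(cost, row_mass, col_mass)
pseudo = Q / Q.sum(axis=1, keepdims=True)   # row-normalized soft labels
```

Because the cluster marginal `col_mass` is skewed, the resulting pseudo-labels respect the imbalanced prior rather than forcing equal-sized clusters, which is the failure mode of uniform-marginal OT assignment on long-tailed data.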

