Deep Clustering with Diffused Sampling and Hardness-aware Self-distillation (2401.14038v2)
Abstract: Deep clustering has gained significant attention for its ability to learn clustering-friendly representations without labeled data. However, previous deep clustering methods tend to treat all samples equally, neglecting the variance in the latent distribution and the varying difficulty of classifying or clustering different samples. To address this, this paper proposes a novel end-to-end deep clustering method with diffused sampling and hardness-aware self-distillation (HaDis). Specifically, we first align one view of instances with another view via diffused sampling alignment (DSA), which improves intra-cluster compactness. To alleviate sampling bias, we present a hardness-aware self-distillation (HSD) mechanism that mines the hardest positive and negative samples and adaptively adjusts their weights in a self-distillation fashion, thereby handling the potential imbalance in sample contributions during optimization. Further, prototypical contrastive learning is incorporated to simultaneously enhance inter-cluster separability and intra-cluster compactness. Experimental results on five challenging image datasets demonstrate the superior clustering performance of our HaDis method over the state of the art. Source code is available at https://github.com/Regan-Zhang/HaDis.
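The abstract's core idea of hardness-aware weighting can be sketched roughly as follows. This is a minimal illustrative toy in NumPy, not the paper's actual HSD implementation: it assumes two views (a "student" and a "teacher" embedding per sample) and cluster labels, and simply up-weights hard negatives (negatives with high cross-view similarity to the anchor) via a temperature-scaled softmax; the function name, the temperature `tau`, and the weighting scheme are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Normalize rows to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def hardness_weights(student, teacher, labels, tau=0.5):
    """Toy hardness-aware negative weighting (hypothetical sketch, not HaDis itself).

    student, teacher: (N, D) embeddings of the two views of the same batch.
    labels: (N,) cluster assignments; samples sharing a label are positives.
    Returns the (N, N) cross-view similarity matrix and per-anchor weights
    over negatives, where harder negatives (higher similarity) get more weight.
    """
    s = l2_normalize(student)
    t = l2_normalize(teacher)
    sim = s @ t.T                                  # (N, N) cosine similarities
    same = labels[:, None] == labels[None, :]      # mask out positives
    neg_sim = np.where(same, -np.inf, sim)
    # Softmax over negatives: exp(-inf) = 0, so positives receive zero weight
    # and the hardest negative in each row receives the largest weight.
    w = np.exp(neg_sim / tau)
    w = w / w.sum(axis=1, keepdims=True)
    return sim, w
```

In a full method these weights would typically rescale the negative terms of a contrastive loss during self-distillation; here they only demonstrate the mining-and-reweighting idea described in the abstract.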
- Hai-Xin Zhang
- Dong Huang