Stable Cluster Discrimination for Deep Clustering (2311.14310v1)
Abstract: Deep clustering can simultaneously optimize representations of instances (i.e., representation learning) and explore the inherent data distribution (i.e., clustering), and thus outperforms conventional clustering methods that operate on given features. However, the coupled objective admits a trivial solution in which all instances collapse to uniform features. To tackle this challenge, two-stage training strategies decouple the objectives by introducing an additional pre-training stage for representation learning and then fine-tuning the obtained model for clustering. Meanwhile, existing one-stage methods are designed mainly for representation learning rather than clustering, and impose various constraints on cluster assignments to avoid collapse explicitly. Despite the success of these methods, an appropriate learning objective tailored for deep clustering has not been investigated sufficiently. In this work, we first show that the discrimination task prevalent in supervised learning is unstable for one-stage clustering, due to the lack of ground-truth labels and of positive instances for certain clusters in each mini-batch. To mitigate the issue, a novel stable cluster discrimination (SeCu) task is proposed, from which a new hardness-aware clustering criterion is obtained. Moreover, a global entropy constraint on cluster assignments is studied with efficient optimization. Extensive experiments on benchmark data sets and ImageNet show that SeCu achieves state-of-the-art performance on all of them, demonstrating the effectiveness of one-stage deep clustering. Code is available at \url{https://github.com/idstcv/SeCu}.
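To make the abstract's setup concrete, here is a minimal sketch of a generic one-stage cluster-discrimination objective with a global entropy constraint on cluster assignments. This is an illustration of the family of objectives the abstract discusses, not the paper's actual SeCu formulation; the function names (`cluster_discrimination_loss`, `softmax`) and the parameters `tau` (temperature) and `lam` (entropy weight) are assumptions introduced for this example.

```python
# Hypothetical sketch: cross-entropy of each instance against learnable
# cluster centers (cluster discrimination), plus a global entropy term on
# the batch-average assignment distribution that discourages all instances
# collapsing into a single cluster. Not the paper's SeCu loss.
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cluster_discrimination_loss(features, centers, labels, tau=0.1, lam=1.0):
    """Average cross-entropy of instances vs. cluster centers, minus a
    weighted entropy of the mean assignment distribution (maximizing that
    entropy pushes toward balanced use of all clusters)."""
    k = len(centers)
    avg_assign = [0.0] * k
    ce = 0.0
    for x, y in zip(features, labels):
        # Similarity of instance x to every cluster center, scaled by tau.
        logits = [sum(a * b for a, b in zip(x, c)) / tau for c in centers]
        p = softmax(logits)
        ce -= math.log(p[y] + 1e-12)
        for j in range(k):
            avg_assign[j] += p[j] / len(features)
    entropy = -sum(q * math.log(q + 1e-12) for q in avg_assign)
    return ce / len(features) - lam * entropy
```

With two well-separated groups of features and matching centers, assigning instances to the correct clusters yields a lower loss than collapsing every instance into one cluster, which is the failure mode the entropy constraint is meant to penalize.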