Contrastive Mean-Shift Learning for Generalized Category Discovery (2404.09451v1)
Abstract: We address the problem of generalized category discovery (GCD) that aims to partition a partially labeled collection of images; only a small part of the collection is labeled and the total number of target classes is unknown. To address this generalized image clustering problem, we revisit the mean-shift algorithm, i.e., a classic, powerful technique for mode seeking, and incorporate it into a contrastive learning framework. The proposed method, dubbed Contrastive Mean-Shift (CMS) learning, trains an image encoder to produce representations with better clustering properties by an iterative process of mean shift and contrastive update. Experiments demonstrate that our method, both in settings with and without the total number of clusters being known, achieves state-of-the-art performance on six public GCD benchmarks without bells and whistles.
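As a rough illustration of the mode-seeking idea the abstract refers to, the sketch below applies one mean-shift step to a handful of embeddings. It is not the paper's implementation. The cosine similarity, the k-nearest-neighbor neighborhood, the exponential weights, the `temperature` parameter, and the function names are all assumptions made for this example, and the contrastive update that CMS alternates with mean shift is not shown.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean_shift_step(embeddings, k=2, temperature=0.5):
    """Move each embedding toward a weighted mean of its k nearest neighbors.

    Assumptions (not taken from the abstract): cosine similarity, a kNN
    neighborhood that includes the point itself, and exponential weights.
    """
    z = [normalize(v) for v in embeddings]
    shifted = []
    for zi in z:
        sims = [dot(zi, zj) for zj in z]
        # Indices of the k most similar points (the point itself has similarity 1).
        neighbors = sorted(range(len(z)), key=lambda j: -sims[j])[:k]
        weights = [math.exp(sims[j] / temperature) for j in neighbors]
        total = sum(weights)
        mean = [
            sum(w * z[j][d] for w, j in zip(weights, neighbors)) / total
            for d in range(len(zi))
        ]
        # Project the shifted point back onto the unit sphere.
        shifted.append(normalize(mean))
    return shifted

# Toy data: two loose groups of 2-D points.
points = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
shifted = mean_shift_step(points, k=2)
```

On this toy input, each point moves toward its nearest neighbor, so the two groups become tighter while staying well separated. Repeating the step pulls points further toward their local modes, which is the clustering property the abstract describes the encoder as being trained to improve.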