
SPICE: Semantic Pseudo-labeling for Image Clustering (2103.09382v3)

Published 17 Mar 2021 in cs.CV and cs.AI

Abstract: The similarity among samples and the discrepancy between clusters are two crucial aspects of image clustering. However, current deep clustering methods suffer from the inaccurate estimation of either feature similarity or semantic discrepancy. In this paper, we present a Semantic Pseudo-labeling-based Image ClustEring (SPICE) framework, which divides the clustering network into a feature model for measuring the instance-level similarity and a clustering head for identifying the cluster-level discrepancy. We design two semantics-aware pseudo-labeling algorithms, prototype pseudo-labeling, and reliable pseudo-labeling, which enable accurate and reliable self-supervision over clustering. Without using any ground-truth label, we optimize the clustering network in three stages: 1) train the feature model through contrastive learning to measure the instance similarity, 2) train the clustering head with the prototype pseudo-labeling algorithm to identify cluster semantics, and 3) jointly train the feature model and clustering head with the reliable pseudo-labeling algorithm to improve the clustering performance. Extensive experimental results demonstrate that SPICE achieves significant improvements (~10%) over existing methods and establishes the new state-of-the-art clustering results on six image benchmark datasets in terms of three popular metrics. Importantly, SPICE significantly reduces the gap between unsupervised and fully-supervised classification; e.g., there is only a 2% (91.8% vs 93.8%) accuracy difference on CIFAR-10. Our code has been made publicly available at https://github.com/niuchuangnn/SPICE.

Citations (132)

Summary

  • The paper introduces SPICE, a novel framework that uses semantic pseudo-labeling to bridge the gap between instance-level similarity and semantic-level discrepancies.
  • It employs a three-stage approach that integrates contrastive learning, prototype pseudo-labeling, and joint training to incrementally refine feature representation and clustering accuracy.
  • Experimental results demonstrate ~10% improvements in clustering metrics, nearly matching supervised performance on benchmarks like CIFAR-10.

Overview of SPICE: Semantic Pseudo-Labeling for Image Clustering

The paper introduces SPICE, a framework leveraging semantic pseudo-labeling for unsupervised image clustering. It addresses a common weakness of deep clustering methods: inaccurate estimation of instance-level similarity and semantic-level discrepancy. SPICE incrementally trains a clustering network in stages, refining traditional clustering approaches to optimize both feature representation and class prediction accuracy.

Methodology

SPICE splits the clustering process into three distinct stages, each tailored to address different aspects of the clustering task:

  1. Feature Model Training with Contrastive Learning: This stage employs a contrastive learning paradigm to enhance the feature model's ability to distinguish between different instances. Contrastive learning, with its reliance on instance discrimination tasks, provides an unsupervised mechanism to pull together representations of different transformations of the same image while pushing apart those of different images. The feature model, F, is trained to produce highly discriminative features without the need for labeled data, leveraging the instance-level information inherent in the dataset.
  2. Clustering Head Training with Prototype Pseudo-Labeling: The heart of SPICE's novelty lies in its semantics-aware clustering strategy. The framework divides the clustering network into the feature model and a clustering head, C. The training process employs a prototype pseudo-labeling algorithm, which iteratively estimates cluster centers using the most confident samples and assigns pseudo-labels to images based on their proximity to these centers in the feature space. This method captures both the similarity among instances and the semantic discrepancy between clusters, training the clustering head more accurately within an expectation-maximization framework.
  3. Joint Training with Reliable Pseudo-Labeling: The final stage seeks to reinforce and refine the model by integrating reliable samples identified using consistency and confidence metrics. These samples originate from the feature model's unsupervised output, aiming to filter out noise and focus on accurately tagged data points. Joint training utilizing this subset of images allows both the feature model and clustering head to synergistically improve under a semi-supervised learning paradigm, tapping into the identified semantic information across the data.
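Stage 1's instance discrimination objective can be illustrated with a minimal NumPy sketch of an NT-Xent-style (SimCLR-style) contrastive loss. This is a generic formulation, not SPICE's exact implementation; the function name and temperature value are illustrative.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss over two augmented views.

    z1, z2: (N, D) arrays of L2-normalised embeddings of the same N images
    under two different augmentations. Matching rows are positive pairs;
    the other 2N - 2 embeddings in the batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    sim = z @ z.T / temperature                       # cosine similarities (inputs are unit-norm)
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = z1.shape[0]
    # index of each embedding's positive partner in the concatenated batch
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

When the two views of each image map to the same embedding, the loss is low; when positives are mismatched, it rises, which is the signal that pulls views of the same image together.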
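Stage 2's prototype pseudo-labeling can be sketched as follows: average the features of the samples the clustering head is most confident about in each cluster to form a prototype, then relabel every sample by its nearest prototype. This is a simplified NumPy illustration of the idea; the function name, `top_m` value, and cosine-similarity assignment are assumptions, not SPICE's exact procedure.

```python
import numpy as np

def prototype_pseudo_labels(feats, probs, top_m=10):
    """Estimate one prototype per cluster from the top_m most confident
    samples, then pseudo-label every sample by its nearest prototype.

    feats: (N, D) L2-normalised features from the (frozen) feature model.
    probs: (N, K) soft cluster assignments from the clustering head.
    Returns (pseudo_labels, prototypes).
    """
    n, k = probs.shape
    prototypes = np.zeros((k, feats.shape[1]))
    for c in range(k):
        top = np.argsort(probs[:, c])[-top_m:]        # most confident samples for cluster c
        proto = feats[top].mean(axis=0)
        prototypes[c] = proto / (np.linalg.norm(proto) + 1e-12)
    pseudo = (feats @ prototypes.T).argmax(axis=1)    # nearest prototype by cosine similarity
    return pseudo, prototypes
```

Alternating between prototype estimation and pseudo-label assignment is what gives the stage its expectation-maximization flavor: the confident-sample averages play the role of the E-step estimates, and retraining the head on the pseudo-labels is the M-step.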
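Stage 3's reliable-sample selection can be sketched as a combination of a confidence test and a local-consistency test: keep a sample only if the head is confident in its predicted cluster and its nearest neighbors in feature space mostly agree with that prediction. The thresholds and neighbor count below are illustrative, not the paper's settings.

```python
import numpy as np

def select_reliable(feats, probs, n_neighbors=5, agree_ratio=0.8, conf_thresh=0.9):
    """Select reliably pseudo-labeled samples for joint training.

    A sample is kept when (a) the head's confidence in its predicted cluster
    is at least conf_thresh, and (b) at least agree_ratio of its n_neighbors
    nearest neighbors in feature space share that predicted label.

    feats: (N, D) L2-normalised features; probs: (N, K) soft assignments.
    Returns (boolean mask of reliable samples, hard labels).
    """
    labels = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    sim = feats @ feats.T
    np.fill_diagonal(sim, -np.inf)                    # a sample is not its own neighbor
    nbrs = np.argsort(sim, axis=1)[:, -n_neighbors:]  # indices of nearest neighbors
    agree = (labels[nbrs] == labels[:, None]).mean(axis=1)
    mask = (conf >= conf_thresh) & (agree >= agree_ratio)
    return mask, labels
```

The retained subset then serves as labeled data in a semi-supervised setup, letting the feature model and clustering head improve jointly while noisy pseudo-labels are filtered out.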

Results and Implications

The paper’s experimental results highlight the performance gains of SPICE across standard image clustering benchmark datasets such as CIFAR-10, CIFAR-100-20, and STL-10. SPICE consistently raises clustering accuracy, normalized mutual information (NMI), and adjusted Rand index (ARI) by a notable margin (~10% improvement), establishing new state-of-the-art results. A significant outcome of SPICE is its dramatic narrowing of the gap between unsupervised and supervised classification: on CIFAR-10, SPICE reaches 91.8% accuracy, just shy of the fully-supervised mark of 93.8%.

Future Prospects

SPICE underscores the potential of leveraging semantics through pseudo-labeling, setting a precedent for evolution in unsupervised learning tasks, notably clustering. Future work could explore automation of cluster number determination, currently a predefined aspect in most methods, including SPICE. Moreover, the reliance on balanced cluster assumptions might be reconsidered for datasets portraying natural distribution variations across clusters. The adaptability of SPICE’s mechanism to tackle these open challenges would signify a significant advancement in the field.

The integration of unsupervised representation learning techniques with robust pseudo-labeling strategies like SPICE might inspire novel algorithms reconfiguring how instances and their semantics are analyzed within the clustering domain, broadening applications across other machine learning tasks.
