Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification
The paper "Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification," authored by Peng Wang et al., presents a novel approach to tackle the challenges posed by long-tailed image datasets. While traditional classification tasks often benefit from a balanced dataset, real-world scenarios usually present data in a long-tailed distribution, where the majority of the samples belong to a small number of classes (head classes), and the minority of samples are spread across a large number of classes (tail classes). This imbalance complicates classifier training, leading to biased predictions favoring the head classes.
The research leverages supervised contrastive learning to strengthen representation learning on imbalanced datasets, a prerequisite for accurate classification. The proposed hybrid network combines a supervised contrastive loss (SCL) for learning feature representations with a cross-entropy loss for learning the classifier. During training, emphasis shifts smoothly from feature learning to classifier learning, grounded in the principle that well-separated features yield better classifiers.
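To make this transition concrete, the sketch below combines the two losses with a weight that decays over training. This is a minimal PyTorch-style sketch, assuming a parabolic decay schedule; the function and argument names are illustrative, not taken from the authors' code.

```python
import torch.nn.functional as F

def hybrid_loss(logits, features, labels, epoch, total_epochs, contrastive_fn):
    """Weighted sum of a contrastive loss and cross-entropy (sketch).

    The weight alpha starts near 1 (emphasis on feature learning) and decays
    toward 0 (emphasis on classifier learning) as training progresses; the
    parabolic schedule below is one plausible choice, assumed for illustration.
    """
    alpha = 1.0 - (epoch / total_epochs) ** 2      # assumed decay schedule
    loss_feat = contrastive_fn(features, labels)   # feature-learning branch
    loss_cls = F.cross_entropy(logits, labels)     # classifier-learning branch
    return alpha * loss_feat + (1.0 - alpha) * loss_cls
```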
Key contributions include two variants of contrastive loss for feature learning: a standard supervised contrastive (SC) loss and a prototypical supervised contrastive (PSC) loss. Both aim to pull samples of the same class together while pushing samples of different classes apart in a normalized embedding space. The SC loss, adapted from recent unsupervised contrastive methods, treats all same-class samples in a batch as positives, strengthening intra-class similarity and inter-class separation. However, the SC loss needs large batches, and hence substantial memory, to supply enough positives and negatives. The PSC loss is therefore introduced as a lighter alternative: each sample is pulled toward its own class prototype and pushed away from the prototypes of all other classes, keeping memory consumption modest even on datasets with many classes.
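The following minimal PyTorch sketches contrast the two losses; the temperature value, tensor shapes, and function names are assumptions made for illustration rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def sc_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive (SC) loss sketch.

    Every other same-class sample in the batch is a positive; all remaining
    samples act as negatives. The (batch x batch) similarity matrix is the
    source of the memory cost that motivates the PSC variant below.
    """
    z = F.normalize(embeddings, dim=1)                   # (B, D), unit norm
    sim = z @ z.t() / temperature                        # (B, B) similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))      # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_count = pos_mask.sum(1).clamp(min=1)             # avoid divide-by-zero
    return -(log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_count).mean()

def psc_loss(embeddings, prototypes, labels, temperature=0.1):
    """Prototypical supervised contrastive (PSC) loss sketch.

    Each sample is pulled toward its own class prototype and pushed away from
    every other prototype, so memory scales with the number of classes rather
    than with the batch size.
    """
    z = F.normalize(embeddings, dim=1)                   # (B, D)
    p = F.normalize(prototypes, dim=1)                   # (C, D)
    logits = z @ p.t() / temperature                     # (B, C) similarities
    return F.cross_entropy(logits, labels)               # softmax over prototypes
```

Note that the SC sketch materializes a B x B similarity matrix, whereas PSC only needs a B x C matrix against a fixed prototype table, which is why the prototypical variant suits tighter memory budgets.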
The proposed hybrid networks were evaluated on three long-tailed benchmarks: long-tailed versions of CIFAR-10 and CIFAR-100, and the large-scale, naturally imbalanced iNaturalist 2018 dataset. The networks demonstrated marked improvements over existing state-of-the-art methods, affirming the value of contrastive learning strategies in long-tailed settings. In particular, the hybrid networks surpassed purely cross-entropy-based methods across a range of imbalance ratios, with the hybrid SC network performing best when memory was ample and the hybrid PSC network excelling in constrained environments.
Considering the methodological framework and empirical outcomes, the implications of this paper are multifaceted. Practically, the hybrid networks offer a robust tool for applications where data imbalance is pronounced, such as wildlife conservation data or rare-disease diagnostics. Theoretically, the research advances the understanding of how contrastive learning can be adapted to imbalanced learning frameworks. Looking forward, promising directions include extending prototypical supervised contrastive learning to multiple prototypes per class, which could better capture nuanced intra-class variation, and further reducing memory consumption.
Ultimately, this paper represents a significant stride in refining learning strategies for imbalanced datasets, offering valuable insights and a robust architecture for future developments in artificial intelligence and beyond.