- The paper introduces a non-parametric softmax classifier that treats each image as a unique class to enhance feature learning.
- It leverages noise-contrastive estimation and proximal regularization to make instance-level training tractable, outperforming prior unsupervised methods on ImageNet and CIFAR-10.
- The approach demonstrates strong transferability in semi-supervised learning and object detection, reducing reliance on annotated data.
Unsupervised Feature Learning via Non-Parametric Instance Discrimination
Overview
The paper "Unsupervised Feature Learning via Non-Parametric Instance Discrimination" introduces a novel method to learn feature representations without using annotated data. This paper is particularly significant in the field of computer vision and unsupervised learning, especially given the challenges and costs associated with obtaining labeled datasets.
Methodology
The authors propose an unsupervised learning approach that treats each image instance as its own class, in contrast to the conventional reliance on semantic categories. The method employs a non-parametric classification strategy, using noise-contrastive estimation (NCE) to keep learning efficient despite the enormous number of instance-level classes.
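Concretely, writing v for the feature of an image and v_1, ..., v_n for the stored features of the n training instances (all L2-normalized), the paper's non-parametric softmax assigns

```latex
P(i \mid \mathbf{v}) = \frac{\exp(\mathbf{v}_i^\top \mathbf{v} / \tau)}{\sum_{j=1}^{n} \exp(\mathbf{v}_j^\top \mathbf{v} / \tau)}
```

where τ is a temperature hyperparameter (0.07 in the paper). The denominator sums over every instance in the dataset, and this full sum is exactly what NCE approximates.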
Key Innovations:
- Non-Parametric Softmax Classifier: Traditional softmax classifiers are parametric, with a fixed weight vector per class. This approach replaces the class weights with the L2-normalized feature vector of each instance, stored in a memory bank, which lets the model generalize readily to unseen instances (see the sketch after this list).
- Instance-Level Discrimination: Instead of classifying images into predefined categories, the model is trained to maximize the separability of individual instances, so that each image is discriminated against every other image in the dataset.
- Noise-Contrastive Estimation: NCE addresses the scalability problem created by treating each image as a distinct class. It recasts the full multi-class problem as binary classification between data and noise samples, so the per-example cost depends on the number of noise samples (m = 4096 in the paper) rather than on the dataset size.
- Proximal Regularization: Because each instance-level "class" is visited only once per epoch, training can oscillate. A proximal term that penalizes the difference between an instance's current and previous features stabilizes the learning dynamics and aids smoother convergence.
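To make the pieces above concrete, here is a minimal PyTorch sketch of the training loss. It is not the authors' implementation: the names `memory_bank`, `nce_instance_loss`, and `update_memory` are invented for illustration, the partition-function estimate is a crude stand-in, and the proximal weight and memory momentum values are assumptions (the temperature τ = 0.07 and m = 4096 noise samples match the paper).

```python
import torch
import torch.nn.functional as F

# Illustrative sizes; the paper uses 128-d features. Names are assumptions.
n_images, dim = 50_000, 128
memory_bank = F.normalize(torch.randn(n_images, dim), dim=1)  # one slot per image
tau = 0.07           # temperature (value from the paper)
m = 4096             # noise samples per instance (value from the paper)
lambda_prox = 30.0   # proximal weight (assumed; see the paper for the exact setting)

def nce_instance_loss(v, idx):
    """Simplified instance-discrimination loss with NCE and a proximal term.

    v:   (batch, dim) features from the backbone for the current batch
    idx: (batch,) integer ids of the corresponding training images
    """
    v = F.normalize(v, dim=1)
    batch = v.size(0)

    # Positive term: similarity of each feature to its own memory slot.
    pos = torch.exp((v * memory_bank[idx]).sum(dim=1) / tau)

    # Noise term: m uniformly drawn instances act as negatives.
    noise_idx = torch.randint(0, n_images, (batch, m))
    neg = torch.exp(torch.bmm(memory_bank[noise_idx], v.unsqueeze(2)).squeeze(2) / tau)

    # The paper treats the partition function Z as a constant; here we
    # crudely estimate it from the noise samples (an assumption).
    Z = neg.mean() * n_images
    p_noise = 1.0 / n_images  # uniform noise distribution over instances

    # NCE: binary classification of "data sample" vs. "noise sample".
    h_pos = (pos / Z) / (pos / Z + m * p_noise)
    h_neg = (neg / Z) / (neg / Z + m * p_noise)
    loss = -torch.log(h_pos).mean() - torch.log(1 - h_neg).sum(dim=1).mean()

    # Proximal regularization: keep v close to its previous (stored) value.
    loss = loss + lambda_prox * (v - memory_bank[idx]).pow(2).sum(dim=1).mean()
    return loss

@torch.no_grad()
def update_memory(v, idx, momentum=0.5):
    # After each step, refresh the memory slots toward the new features
    # (the momentum value here is an assumption).
    mixed = momentum * memory_bank[idx] + (1 - momentum) * F.normalize(v, dim=1)
    memory_bank[idx] = F.normalize(mixed, dim=1)
```

The memory bank is what makes the non-parametric formulation practical: features for negatives are read from it rather than recomputed by the network at every step.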
Experimental Results
The proposed method outperforms prior state-of-the-art unsupervised learning methods across several benchmarks:
- ImageNet Classification: The non-parametric approach achieves 46.5% top-1 accuracy with a ResNet-50 backbone, evaluated with a weighted kNN classifier on the learned features, surpassing several baselines including self-supervised and adversarial learning methods.
- CIFAR-10: The dataset is small enough that the non-parametric softmax can be computed exactly, without NCE approximation, and it clearly outperforms its parametric counterpart: a nearest-neighbor classifier on the learned features reaches 80.8% accuracy versus 63.0% with the parametric softmax.
- Feature Compactness: The learned 128-dimensional representations are highly compact, enabling efficient storage (about 600 MB for a million images) and fast nearest-neighbor retrieval at run time, as illustrated by the sketch below.
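Evaluation with these features is correspondingly cheap. Below is a hedged sketch of the weighted kNN classifier used at test time, with k = 200 neighbors and exponentially weighted votes as in the paper; the function name and the `bank`/`labels` arguments are hypothetical.

```python
import torch
import torch.nn.functional as F

def weighted_knn_predict(v, bank, labels, k=200, tau=0.07, n_classes=10):
    """Weighted kNN vote over stored instance features (paper's evaluation protocol).

    v:      (batch, dim) query features
    bank:   (n_images, dim) L2-normalized memory bank of training features
    labels: (n_images,) integer training labels
    """
    v = F.normalize(v, dim=1)
    sims = v @ bank.t()                        # cosine similarity to every image
    topk_sims, topk_idx = sims.topk(k, dim=1)  # the k most similar training images
    weights = torch.exp(topk_sims / tau)       # similarity-weighted votes
    votes = torch.zeros(v.size(0), n_classes)
    votes.scatter_add_(1, labels[topk_idx], weights)
    return votes.argmax(dim=1)                 # predicted class per query
```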
Generalization and Transferability
The approach also shows competitive results in various tasks beyond standard classification, such as:
- Semi-Supervised Learning: The method scales gracefully with the amount of labeled data: pretraining on unlabeled images substantially improves performance after fine-tuning on a small labeled subset. With only 1% of ImageNet labels, the model's top-5 accuracy far exceeds training from scratch (a fine-tuning sketch follows this list).
- Object Detection: Fine-tuning the learned features on PASCAL VOC 2007 for object detection tasks yields mean average precision (mAP) scores that are competitive with state-of-the-art methods. The paper reports an mAP of 65.4% using ResNet-50, highlighting effective generalization capabilities.
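The transfer recipe underlying both results is the standard pretrain-then-fine-tune loop. A minimal sketch, assuming a hypothetical checkpoint file `unsup_instance_disc.pth` holding the unsupervised ResNet-50 weights (the key layout and hyperparameters below are assumptions, not the paper's exact settings):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Start from the unsupervised backbone; checkpoint name and key layout
# are assumptions for illustration.
model = resnet50(num_classes=1000)
state = torch.load("unsup_instance_disc.pth", map_location="cpu")
backbone = {k: w for k, w in state.items() if not k.startswith("fc.")}
model.load_state_dict(backbone, strict=False)  # the new fc head stays random

# Fine-tune end-to-end on the labeled subset (e.g. 1% of ImageNet labels).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                            weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()
```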
Implications and Future Work
The implications of this research are multifaceted. Practically, the reduction in reliance on annotated datasets promises significant cost savings and opens up opportunities for applications where labeled data is scarce or unavailable. Theoretically, the paper suggests that instance-level discrimination might inherently capture semantic similarity effectively, challenging the notion that annotations are indispensable for meaningful feature learning.
Future research could explore further enhancements to non-parametric models, such as integrating additional forms of regularization or developing more efficient approximations for even larger datasets. Another promising direction is the adaptation of this approach to other domains beyond visual tasks, such as text or audio processing, to validate the robustness and versatility of instance-level discrimination as a universal unsupervised learning paradigm.
Conclusion
This paper offers a profound contribution to the field of unsupervised learning, presenting a scalable and effective method for learning discriminative feature representations without labeled data. Its strong empirical performance and potential for broader application underscore the viability of non-parametric instance discrimination as a valuable alternative to traditional supervised methods.