- The paper's main contribution is a novel instance-wise softmax embedding method that directly optimizes unlabeled instance features by enforcing data augmentation invariance and an instance spread-out property.
- It trains a Siamese network with an inner-product-based softmax, optimizing the features themselves to achieve robust separation between instance representations.
- Empirical tests on CIFAR-10 and STL-10 show improved kNN accuracy and strong generalization, outperforming previous unsupervised methods.
Unsupervised Embedding Learning via Invariant and Spreading Instance Feature
The paper presents an approach to unsupervised embedding learning that optimizes instance-wise feature representations through data augmentation invariance and an instance spread-out property. Traditional supervised embedding methods rely on annotated data to pull positive samples into tight clusters and push negative samples apart. The research addresses the challenge of achieving these properties without labeled datasets, which are often costly and laborious to obtain.
Core Innovation
The proposed method employs a novel instance-based softmax embedding that learns directly from unlabeled data. Rather than parameterizing the softmax with per-instance class weights (as in Exemplar CNN) or reading stale features from a memory bank (as in instance discrimination), it computes the softmax over inner products of the features themselves. Gradients therefore flow through up-to-date representations of every instance in the batch, which makes training both more efficient and more accurate than competing methods.
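To make the formulation concrete, here is a minimal PyTorch sketch of an inner-product instance softmax over a mini-batch. The function name `instance_softmax_probs`, the temperature value, and the batch conventions are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def instance_softmax_probs(feats, aug_feats, tau=0.1):
    """Instance-wise softmax over a mini-batch (illustrative sketch).

    feats:     (m, d) L2-normalized features of the m instances
    aug_feats: (m, d) L2-normalized features of their augmented views
    Returns an (m, m) matrix whose (i, j) entry is the probability of
    augmented sample j being classified as instance i.
    """
    # Inner products between every instance and every augmented sample;
    # with unit-norm features this is cosine similarity.
    logits = feats @ aug_feats.t() / tau   # (m, m)
    return F.softmax(logits, dim=0)        # normalize over instances
```

With L2-normalized features the inner product reduces to cosine similarity, and the temperature tau controls how sharply the distribution concentrates on the nearest instance.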
Methodology
- Data Augmentation Invariance: the features of the same instance under different data augmentations should be invariant, so the distance between augmented views of a sample is minimized (a representative augmentation pipeline is sketched after this list).
- Instance Spread-Out: the features of different instances are pushed apart by treating the other instances in a mini-batch as negatives. Because two randomly sampled instances are unlikely to share a semantic class, this is a safe approximation, and it yields a more discriminative, spread-out feature space.
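A representative augmentation pipeline for 32x32 CIFAR-10 images might look as follows; the specific transforms and parameter values are illustrative choices in the spirit of the paper, not its exact recipe.

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline producing the "augmented views"
# whose features should stay close to the original instance's feature.
augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.2, 1.0)),   # random crop and rescale
    T.RandomHorizontalFlip(),                    # mirror with probability 0.5
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.RandomGrayscale(p=0.2),                    # occasional color removal
    T.ToTensor(),
])
```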
The technique trains a Siamese network and, via maximum likelihood estimation, recasts the multi-class instance classification problem as a binary one: each augmented sample should be recognized as its own instance, and every other instance should be rejected, as sketched below.
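Under that reading, the objective combines a negative log-likelihood term for the augmented views (invariance) with a rejection term for all other instances (spread-out). The sketch below assumes `instance_softmax_probs` from the earlier block is in scope; it follows the paper's maximum-likelihood formulation, but the batching and hyperparameters are illustrative.

```python
import torch

def invariance_spread_loss(feats, aug_feats, tau=0.1):
    """Maximum-likelihood loss over a mini-batch of m instances (sketch).

    feats:     (m, d) L2-normalized features of the original samples
    aug_feats: (m, d) L2-normalized features of their augmented views
    """
    m = feats.size(0)
    # P(i | x_hat_j): probability of augmented sample j being instance i
    p_aug = instance_softmax_probs(feats, aug_feats, tau)   # (m, m)
    # P(i | x_j): probability of original sample j being instance i
    p_orig = instance_softmax_probs(feats, feats, tau)      # (m, m)

    eye = torch.eye(m, dtype=torch.bool, device=feats.device)
    # Invariance: each augmented view is recognized as its own instance.
    invariance = -torch.log(p_aug.diagonal()).sum()
    # Spread-out: no other instance j is classified as instance i.
    spread = -torch.log(1.0 - p_orig[~eye]).sum()
    return (invariance + spread) / m
```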
Empirical Results
The method outperforms existing unsupervised approaches on benchmark datasets like CIFAR-10 and STL-10, achieving superior kNN accuracy (83.6% on CIFAR-10) within substantially fewer training epochs. It is also competitive when compared to supervised methods on fine-grained datasets such as CUB200. Importantly, it demonstrates robustness and generalization, performing well even on unseen category testing.
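For context, kNN accuracy here means classifying each test image by its nearest training embeddings under a frozen encoder. A plain majority-vote variant is sketched below; the paper's exact evaluation protocol (e.g., similarity weighting or the choice of k) may differ.

```python
import torch

def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=5):
    """Cosine-similarity kNN evaluation of frozen embeddings (sketch).

    train_feats: (n_train, d) and test_feats: (n_test, d), L2-normalized.
    """
    sims = test_feats @ train_feats.t()           # (n_test, n_train) cosine sims
    _, idx = sims.topk(k, dim=1)                  # k nearest neighbors per test point
    neighbor_labels = train_labels[idx]           # (n_test, k)
    preds = neighbor_labels.mode(dim=1).values    # majority vote over neighbors
    return (preds == test_labels).float().mean().item()
```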
Experiments confirmed the significance of the data augmentation invariance and instance spread-out strategies, with a detailed ablation study quantifying the contribution of each component.
Implications and Future Directions
This research contributes an effective and efficient embedding learning framework to the field of unsupervised learning. Its ability to generalize across unseen categories without relying on annotated data could substantially impact large-scale vision tasks, making the method practical for applications that lack comprehensive labeled datasets.
Future research might explore the adaptability of this method to different domains, such as video processing or higher-dimensional data, and investigate its integration with other unsupervised approaches for improved performance.
This paper takes a substantive step towards more autonomous learning systems capable of mimicking human-like understanding and categorization in the absence of explicit supervision.