Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning (1903.12290v2)

Published 28 Mar 2019 in cs.CV

Abstract: Few-shot learning in image classification aims to learn a classifier to classify images when only a few training examples are available for each class. Recent work has achieved promising classification performance, where an image-level feature based measure is usually used. In this paper, we argue that a measure at such a level may not be effective enough in light of the scarcity of examples in few-shot learning. Instead, we think a local descriptor based image-to-class measure should be taken, inspired by its surprising success in the heyday of local invariant features. Specifically, building upon the recent episodic training mechanism, we propose a Deep Nearest Neighbor Neural Network (DN4 in short) and train it in an end-to-end manner. Its key difference from the literature is the replacement of the image-level feature based measure in the final layer by a local descriptor based image-to-class measure. This measure is conducted online via a $k$-nearest neighbor search over the deep local descriptors of convolutional feature maps. The proposed DN4 not only learns the optimal deep local descriptors for the image-to-class measure, but also utilizes the higher efficiency of such a measure in the case of example scarcity, thanks to the exchangeability of visual patterns across the images in the same class. Our work leads to a simple, effective, and computationally efficient framework for few-shot learning. Experimental study on benchmark datasets consistently shows its superiority over the related state-of-the-art, with the largest absolute improvement of $17\%$ over the next best. The source code is available from https://github.com/WenbinLee/DN4.git.

Citations (458)

Summary

  • The paper introduces DN4, a method that replaces image-level features with a local descriptor-based image-to-class measure for improved few-shot learning.
  • It employs an episodic training mechanism and k-nearest neighbor search on deep local descriptors to enhance discriminative power.
  • Experimental results demonstrate up to a 17% accuracy improvement on benchmarks such as miniImageNet, validating its effectiveness.

Revisiting Local Descriptor Based Image-to-Class Measure for Few-shot Learning

The paper "Revisiting Local Descriptor based Image-to-Class Measure for Few-shot Learning" introduces a novel approach for tackling the challenges inherent in few-shot learning tasks, where the objective is to classify images with only a few examples available for each class. The authors propose a method called Deep Nearest Neighbor Neural Network (DN4), which leverages local descriptor-based image-to-class measures that deviate from the traditional image-level feature-based measures commonly used in previous approaches.

Key Contributions

  1. Local Descriptor Emphasis: The authors argue that summarizing images into compact, image-level representations can lead to a significant loss of discriminative information, particularly detrimental in few-shot learning due to the limited training samples. Instead, DN4 utilizes local descriptor-based measures that provide a richer representation and leverage the exchangeability of visual patterns across images in the same class.
  2. Image-to-Class Measure: Inspired by the principles of the Naive-Bayes Nearest-Neighbor (NBNN) method, DN4 replaces image-level features in the final classification layer with a local descriptor based image-to-class measure. This measure performs a $k$-nearest neighbor search over the deep local descriptors obtained from convolutional feature maps (a minimal sketch of this measure follows the list).
  3. Episodic Training Mechanism: The framework is trained end-to-end with episodic training, a strategy that simulates few-shot conditions during the training phase and thereby improves adaptation to new tasks with limited examples (an episodic-training sketch also follows the list).
  4. Experimental Superiority: The DN4 framework demonstrates superior performance across multiple benchmark datasets, achieving up to a 17% improvement over existing state-of-the-art methods in some cases. This marks a significant advancement in effectively handling few-shot learning scenarios.
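
To make the image-to-class measure in point 2 concrete, below is a minimal PyTorch-style sketch, not the authors' released implementation: the helper names (`image_to_class_score`, `classify_query`), the tensor shapes, and the default `k=3` are illustrative assumptions. It flattens a query's convolutional feature map into local descriptors, pools all local descriptors of one class's support images, and sums each query descriptor's k largest cosine similarities to obtain the class score.

```python
import torch
import torch.nn.functional as F

def image_to_class_score(query_map, support_maps, k=3):
    """Image-to-class similarity in the DN4 style.

    query_map:    (C, H, W) conv feature map of one query image.
    support_maps: (K, C, H, W) feature maps of the K support images of one class.
    Returns a scalar similarity of the query to this class.
    """
    C = query_map.shape[0]
    # Flatten the query map into H*W local descriptors of dimension C.
    q = query_map.reshape(C, -1).t()                         # (HW, C)
    # Pool all local descriptors of the class's support images.
    s = support_maps.reshape(support_maps.shape[0], C, -1)   # (K, C, HW)
    s = s.permute(0, 2, 1).reshape(-1, C)                    # (K*HW, C)

    # Cosine similarity between every query descriptor and every pooled descriptor.
    q = F.normalize(q, dim=1)
    s = F.normalize(s, dim=1)
    sim = q @ s.t()                                          # (HW, K*HW)

    # For each query descriptor, keep its k nearest neighbours and sum them;
    # the image-to-class score is the sum over all query descriptors.
    topk = sim.topk(k, dim=1).values                         # (HW, k)
    return topk.sum()

def classify_query(query_map, support_maps_per_class, k=3):
    """Assign the query to the class with the largest image-to-class score."""
    scores = torch.stack([image_to_class_score(query_map, maps, k)
                          for maps in support_maps_per_class])
    return scores.argmax().item(), scores
```

Because the score is a sum of cosine similarities, it requires no extra learnable parameters beyond the embedding network, which is what keeps the final layer simple and efficient.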
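Point 3's episodic training can be sketched in the same style. The sketch below is likewise an assumption-laden outline rather than the paper's code: `embed_net` stands for any fully convolutional embedding network, `images_by_class` is a hypothetical dict mapping class ids to lists of image tensors, and the 5-way, 1-shot, 15-query settings are common benchmark defaults. Each episode mimics a few-shot task, and a cross-entropy loss over the image-to-class scores (reusing `image_to_class_score` from the previous sketch) drives end-to-end training of the embedding.

```python
import random
import torch
import torch.nn.functional as F

def sample_episode(images_by_class, n_way=5, k_shot=1, q_query=15):
    """Draw one N-way K-shot episode from {class_id: [image tensors (3, H, W)]}."""
    classes = random.sample(list(images_by_class), n_way)
    support, query, labels = [], [], []
    for episode_label, cls in enumerate(classes):
        picks = random.sample(images_by_class[cls], k_shot + q_query)
        support.append(torch.stack(picks[:k_shot]))      # (K, 3, H, W) per class
        query.extend(picks[k_shot:])
        labels.extend([episode_label] * q_query)
    return support, torch.stack(query), torch.tensor(labels)

def episode_loss(embed_net, support, query, labels, k=3):
    """Cross-entropy over the image-to-class scores of every query image."""
    support_maps = [embed_net(s) for s in support]        # N tensors of (K, C, h, w)
    query_maps = embed_net(query)                         # (N*Q, C, h, w)
    logits = torch.stack([
        torch.stack([image_to_class_score(qm, maps, k) for maps in support_maps])
        for qm in query_maps
    ])                                                    # (N*Q, N) class scores
    return F.cross_entropy(logits, labels)
```

Since the top-k selection simply routes gradients to the selected similarities, the loss remains differentiable with respect to the embedding, letting the network learn local descriptors suited to the image-to-class measure.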

Numerical Results

The DN4 method was evaluated on datasets such as miniImageNet, Stanford Dogs, Stanford Cars, and CUB-200. On miniImageNet, DN4 outperformed the best compared methods, improving accuracy from 50.44% to 51.24% in the 1-shot setting and from 66.53% to 71.02% in the 5-shot setting. These results underscore the efficacy of local descriptor based approaches in the few-shot learning context.

Implications and Future Directions

The proposed DN4 framework holds significant implications for both theoretical and practical applications in few-shot learning. By demonstrating the advantages of local descriptor-based measures, this work challenges the prevailing reliance on image-level features and highlights a different approach to improve classification performance in data-scarce environments.

Future work may explore more sophisticated local descriptors and image-to-class similarity measures. The framework could also be extended to other domains, such as video classification or object detection, where the scarcity of labeled data is a common challenge.

In conclusion, by re-evaluating the use of local descriptors and employing a robust episodic training approach, this paper contributes to a better understanding and handling of few-shot learning problems, paving the way for further innovations in model training and feature representation paradigms.