Few-Shot Classification with Feature Map Reconstruction Networks (2012.01506v2)

Published 2 Dec 2020 in cs.CV and cs.LG

Abstract: In this paper we reformulate few-shot classification as a reconstruction problem in latent space. The ability of the network to reconstruct a query feature map from support features of a given class predicts membership of the query in that class. We introduce a novel mechanism for few-shot classification by regressing directly from support features to query features in closed form, without introducing any new modules or large-scale learnable parameters. The resulting Feature Map Reconstruction Networks are both more performant and computationally efficient than previous approaches. We demonstrate consistent and substantial accuracy gains on four fine-grained benchmarks with varying neural architectures. Our model is also competitive on the non-fine-grained mini-ImageNet and tiered-ImageNet benchmarks with minimal bells and whistles.

Summary

The paper introduces FRNs that reformulate few-shot classification as a feature map reconstruction problem in latent space.
It uses ridge regression with a learned regularization strength to reconstruct query feature maps in closed form, preserving spatial detail without adding new modules or large numbers of learnable parameters.
FRN achieves superior accuracy on benchmarks like CUB and Aircraft, especially in 1-shot settings, while maintaining computational efficiency.

Few-Shot Classification with Feature Map Reconstruction Networks

The paper "Few-Shot Classification with Feature Map Reconstruction Networks" reformulates few-shot classification as a feature map reconstruction problem. The authors, Davis Wertheimer, Luming Tang, and Bharath Hariharan of Cornell University, propose Feature Map Reconstruction Networks (FRNs), which use a closed-form solution to reconstruct a query feature map from the support features of each class. The approach is notable for being both more accurate and more computationally efficient than the prior methods it is compared against.

Key Contributions

The primary contribution of this paper is the introduction of FRNs, which frame class membership as a reconstruction problem in latent space. The key idea is to predict class membership by evaluating how well a query image's feature map can be reconstructed as a weighted sum of feature vectors from support images. The reconstruction is performed in latent space, preserving spatial details while discarding location-specific information.
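The decision rule described above can be written compactly. The notation here is an assumption made for illustration and is not taken from this summary: Q is the query feature map flattened to r spatial locations by d channels, S_c stacks the feature vectors pooled from all support images of class c, and W_c holds the reconstruction weights. Any temperature or scaling applied to the distances is omitted.

```latex
% Q   \in \mathbb{R}^{r \times d}   : query feature map (r locations, d channels)
% S_c \in \mathbb{R}^{m \times d}   : pooled support features of class c
% W_c \in \mathbb{R}^{r \times m}   : weights chosen to make W_c S_c close to Q
\begin{aligned}
\bar{Q}_c &= W_c S_c, \\
d_c &= \frac{1}{r}\,\lVert Q - \bar{Q}_c \rVert_F^2, \\
P(y = c \mid Q) &= \frac{\exp(-d_c)}{\sum_{c'} \exp(-d_{c'})}.
\end{aligned}
```

Because each row of the reconstruction is a weighted sum over all support locations, the position at which a support feature occurred plays no role, which is one way to read "discarding location-specific information".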

Methodology

FRN uses ridge regression to compute the reconstruction in closed form: it minimizes the squared error between the query and reconstructed feature maps, plus a penalty on the size of the reconstruction weights. The strength of that penalty is learned, which stabilizes the reconstruction and helps keep classes separable. Beyond the parameters needed for this regularization, the method adds no network parameters, which keeps it computationally cheap.
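A minimal NumPy sketch of ridge-regression reconstruction follows. It is a sketch under stated assumptions rather than the authors' implementation: the function names `reconstruct` and `class_scores` are hypothetical, the penalty `lam` is passed in as a fixed number rather than learned, and feature maps are assumed to be already flattened to (locations, channels). It relies on the standard ridge solution W = Q Sᵀ (S Sᵀ + λI)⁻¹.

```python
import numpy as np


def reconstruct(query, support, lam):
    """Reconstruct query rows as ridge-regularized combinations of support rows.

    query:   (r, d) array, r spatial locations with d channels each
    support: (m, d) array, feature vectors pooled from one class
    lam:     ridge penalty, assumed > 0 (fixed here; learned in the paper)
    """
    m = support.shape[0]
    gram = support @ support.T + lam * np.eye(m)            # (m, m), symmetric
    # W = Q S^T gram^{-1}; since gram is symmetric, solve gram X = S Q^T
    # and transpose instead of forming an explicit inverse.
    weights = np.linalg.solve(gram, support @ query.T).T    # (r, m)
    return weights @ support                                # (r, d)


def class_scores(query, supports, lam):
    """Negative mean squared reconstruction error per class (higher is better)."""
    return np.array(
        [-np.mean((query - reconstruct(query, s, lam)) ** 2) for s in supports]
    )


# Toy usage with random features: the query is built from class 0's support
# vectors, so class 0 should reconstruct it far better than class 1.
rng = np.random.default_rng(0)
support_0 = rng.normal(size=(10, 64))
support_1 = rng.normal(size=(10, 64))
query = rng.normal(size=(25, 10)) @ support_0
scores = class_scores(query, [support_0, support_1], lam=0.1)
print("predicted class:", int(np.argmax(scores)))
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical choice made for this sketch; the summary does not say how the authors compute the solution.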

Experimental Results

FRN demonstrates substantial improvements in accuracy across multiple fine-grained few-shot classification benchmarks, including CUB (Caltech-UCSD Birds), Aircraft, and meta-iNat datasets, both with and without the use of pre-trained models. In comparisons with state-of-the-art models, FRN consistently outperforms baselines, particularly in 1-shot settings, where fine-grained details are crucial for classification. For general few-shot classification on mini-ImageNet and tiered-ImageNet, FRN exhibits competitive performance.

Computational Efficiency

The paper highlights the computational efficiency of FRN, especially when compared to previous methods like DeepEMD that involve iterative optimization procedures. FRN's closed-form solution allows for efficient parallelization and scalability, offering improved latency and reduced memory usage.
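To illustrate why a closed-form solution parallelizes well, the sketch below solves the reconstruction for every class in a single batched call, with no iterative optimizer. It also shows two algebraically equivalent forms of the ridge solution: one solves a system the size of the number of support vectors (m by m), the other a system the size of the channel dimension (d by d), so an implementation can pick whichever is smaller. The function names are hypothetical, and which form the authors actually use is not stated in this summary.

```python
import numpy as np


def reconstruct_via_gram(query, supports, lam):
    """Q S^T (S S^T + lam I)^{-1} S for all classes at once.

    query:    (r, d) array
    supports: (n_classes, m, d) array
    returns:  (n_classes, r, d) array
    """
    m = supports.shape[1]
    gram = supports @ supports.transpose(0, 2, 1) + lam * np.eye(m)  # (n, m, m)
    rhs = supports @ query.T                                         # (n, m, r)
    weights = np.linalg.solve(gram, rhs)                             # (n, m, r)
    return weights.transpose(0, 2, 1) @ supports                     # (n, r, d)


def reconstruct_via_cov(query, supports, lam):
    """Q (S^T S + lam I)^{-1} S^T S for all classes at once (same shapes)."""
    n, _, d = supports.shape
    cov = supports.transpose(0, 2, 1) @ supports                     # (n, d, d)
    # Give the right-hand side an explicit batch axis so it is treated as a
    # stack of matrices regardless of NumPy version.
    rhs = np.broadcast_to(query.T, (n, d, query.shape[0]))           # (n, d, r)
    x = np.linalg.solve(cov + lam * np.eye(d), rhs)                  # (n, d, r)
    # cov and (cov + lam I)^{-1} are symmetric, so (Q A^{-1} cov)^T = cov A^{-1} Q^T.
    return (cov @ x).transpose(0, 2, 1)                              # (n, r, d)


rng = np.random.default_rng(0)
supports = rng.normal(size=(3, 5, 8))   # 3 classes, 5 support vectors, 8 channels
query = rng.normal(size=(4, 8))         # 4 query locations
a = reconstruct_via_gram(query, supports, lam=0.5)
b = reconstruct_via_cov(query, supports, lam=0.5)
print("forms agree:", np.allclose(a, b))
```

The equivalence follows from the identity Sᵀ (S Sᵀ + λI)⁻¹ = (Sᵀ S + λI)⁻¹ Sᵀ, which holds for any λ > 0.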

Implications and Future Work

The work shows that feature map reconstruction is a viable basis for few-shot learning without extensive finetuning or iterative optimization. Because FRN preserves spatial detail, it may suit applications that depend on fine-grained distinctions, such as medical image classification and biodiversity monitoring.

Further research could adapt FRN to higher-shot learning and test its applicability in other domains that require rapid adaptation to new classes from limited labeled data. Extending FRN to multi-modal data and integrating it with other meta-learning strategies are further possible directions.

In summary, the paper presents an efficient closed-form approach to few-shot classification that delivers substantial accuracy gains on fine-grained benchmarks and competitive results on mini-ImageNet and tiered-ImageNet, and it points to latent-space feature map reconstruction as a promising direction for future research.
