- The paper introduces a novel approach to predict classification parameters from activations for efficient few-shot image recognition.
- It leverages a pre-trained network and a learned feedforward mapping to convert activation statistics into classification layer parameters with a single forward pass.
- Experimental results on ImageNet and MiniImageNet demonstrate significant accuracy gains and computational efficiency over traditional retraining methods.
Few-Shot Image Recognition by Predicting Parameters from Activations
The paper "Few-Shot Image Recognition by Predicting Parameters from Activations" addresses the challenge of few-shot learning in image recognition. The authors introduce a novel approach that adapts a pre-trained neural network to novel categories with minimal exemplars, leveraging the relationship between neural network parameters and activations. This method foregoes the need for traditional training when adapting to new categories and can perform rapid inference with a single forward pass.
Methodology
The proposed approach starts with a neural network pre-trained on a large-scale dataset, D_large, and focuses on adapting it to a smaller few-shot dataset, D_few. The core innovation lies in directly predicting category-specific parameters from the activations of associated images. This is a departure from conventional methods that rely on extensive retraining or parameter fine-tuning.
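To make the setup concrete, the sketch below (an illustration, not the authors' exact pipeline) treats the pre-trained network as a frozen feature extractor and collects activations for the few available images of a novel category. The ResNet-50 backbone, the preprocessing, and the `activations_for` helper are assumptions standing in for whatever architecture was trained on D_large.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Frozen backbone pre-trained on the large-scale dataset D_large.
# ResNet-50 is an illustrative stand-in for the paper's architecture.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()  # expose the pre-classifier activations
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def activations_for(image_paths):
    """Return last-layer activations for a list of few-shot images."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB"))
                         for p in image_paths])
    return backbone(batch)  # shape: (num_images, feature_dim)
```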
The authors hypothesize a strong correlation between a category's activations in the network's final feature layer and the classification-layer parameters for that category. By visualizing both with t-SNE, the paper illustrates their structural similarity, particularly in terms of semantic groupings. Building on this, a category-agnostic mapping is learned to transform a category's activation statistic, specifically the mean activation of its images, into that category's classification-layer parameters.
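Written out (the notation here is a reconstruction and may differ slightly from the paper's), the prediction for a category y uses the mean of that category's activation set A_y and a category-agnostic mapping φ:

```latex
\bar{a}_y = \frac{1}{|A_y|} \sum_{a \in A_y} a,
\qquad
w_y = \phi\!\left(\bar{a}_y\right),
\qquad
p(y \mid x) = \frac{\exp\!\big(a(x)^{\top} w_y\big)}{\sum_{y'} \exp\!\big(a(x)^{\top} w_{y'}\big)},
```

where a(x) is the activation of a test image x and w_y the predicted classification-layer weights for category y. Under this view, adding a new category amounts to computing its mean activation from the few available examples.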
This mapping is implemented as a learned feedforward network that transforms activation statistics into parameter predictions. Once trained, it supports efficient inference and swift adaptation to new categories, requiring only an update of the stored activation statistics rather than any retraining.
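A minimal sketch of such a mapping follows, assuming a two-layer MLP; the paper's actual network depth, regularization, and training-time sampling of statistics are not reproduced here.

```python
import torch
import torch.nn as nn

class ParameterPredictor(nn.Module):
    """Category-agnostic mapping phi: mean activation -> classifier weights.

    A two-layer MLP is an assumption for illustration; the paper's network
    and training procedure differ in detail.
    """
    def __init__(self, feature_dim=2048, hidden_dim=2048):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, feature_dim),
        )

    def forward(self, mean_activations):
        # mean_activations: (num_categories, feature_dim)
        return self.phi(mean_activations)  # predicted weights, same shape

@torch.no_grad()
def classify(test_activations, category_means, predictor):
    """Score test activations against categories whose weights are predicted on the fly."""
    weights = predictor(category_means)       # (num_categories, feature_dim)
    logits = test_activations @ weights.t()   # (num_images, num_categories)
    return logits.argmax(dim=1)

# Adding a novel category requires no gradient steps, only its mean activation, e.g.
#   new_mean = activations_for(few_shot_paths).mean(dim=0, keepdim=True)
#   category_means = torch.cat([category_means, new_mean], dim=0)
```

The design choice this sketch highlights is that the classifier for every category, base or novel, is generated by the same mapping from activation statistics, so adaptation reduces to updating those statistics.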
Experimental Results
The effectiveness of this methodology is evaluated on two datasets: the large-scale ImageNet and MiniImageNet. The approach achieved state-of-the-art classification accuracy on ImageNet's novel categories with a significant margin over previous methods while maintaining performance on the original categories. Additionally, it outperformed existing approaches on the MiniImageNet dataset.
Particularly noteworthy are the few-shot accuracies obtained in 1,000-way classification on ImageNet, showcasing robustness even with only a few examples per category. The method's advantages are further borne out by its computational efficiency during both adaptation and inference, a distinct improvement over traditional non-parametric and parametric approaches.
Implications and Future Work
The implications of this research are significant for few-shot learning and the broader field of computer vision. By reducing dependency on large-scale retraining and data availability, this work paves the way for more practical AI applications where rapid learning from minimal data is crucial.
Future research could explore further optimizing the parameter prediction network and investigate scaling it to more complex or hierarchical datasets. Studying the applicability of this approach to other domains, such as natural language processing, may also reveal the broader potential of parameter prediction from activations.
In conclusion, the paper presents a promising advance in few-shot image recognition, balancing the need for minimal data usage with sophisticated parameter prediction and adaptation techniques. This blend of efficiency and effectiveness holds significant promise for real-world applications where rapid learning from limited examples is often a necessity.