Meta-Transfer Learning for Few-Shot Learning (1812.02391v3)

Published 6 Dec 2018 in cs.CV

Abstract: Meta-learning has been proposed as a framework to address the challenging few-shot learning setting. The key idea is to leverage a large number of similar few-shot tasks in order to learn how to adapt a base-learner to a new task for which only a few labeled samples are available. As deep neural networks (DNNs) tend to overfit using a few samples only, meta-learning typically uses shallow neural networks (SNNs), thus limiting its effectiveness. In this paper we propose a novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few shot learning tasks. Specifically, "meta" refers to training multiple tasks, and "transfer" is achieved by learning scaling and shifting functions of DNN weights for each task. In addition, we introduce the hard task (HT) meta-batch scheme as an effective learning curriculum for MTL. We conduct experiments using (5-class, 1-shot) and (5-class, 5-shot) recognition tasks on two challenging few-shot learning benchmarks: miniImageNet and Fewshot-CIFAR100. Extensive comparisons to related works validate that our meta-transfer learning approach trained with the proposed HT meta-batch scheme achieves top performance. An ablation study also shows that both components contribute to fast convergence and high accuracy.

Meta-Transfer Learning for Few-Shot Learning

"Meta-Transfer Learning for Few-Shot Learning" presents a novel approach to address the few-shot learning problem by leveraging both meta-learning and transfer learning paradigms. Few-shot learning aims to learn new concepts from only a few labeled examples, which remains a challenging problem for deep neural networks (DNNs) due to their tendency to overfit with limited data.

Meta-Transfer Learning Framework

The paper introduces Meta-Transfer Learning (MTL), a method for adapting deep neural networks to few-shot learning tasks. The core idea is to train across multiple tasks and adapt the weights of a pre-trained DNN using scaling and shifting functions learned for each task. The key contributions are twofold:

  1. Scaling and Shifting (SS) Operations: These operations allow MTL to modify the pre-trained DNN weights with only a small number of additional parameters, reducing the risk of overfitting and avoiding catastrophic forgetting (a minimal sketch follows this list).
  2. Hard Task (HT) Meta-Batch Strategy: This learning curriculum dynamically re-samples and focuses on harder tasks by identifying and leveraging the failure classes. Confronting the meta-learner with more challenging scenarios makes it improve more effectively (a simplified loop also follows below).
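
To make the scaling-and-shifting idea concrete, the following is a minimal PyTorch-style sketch, not the authors' released code: a convolutional layer whose pre-trained weights stay frozen while a per-channel scale and shift are the only parameters updated during meta-training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSConv2d(nn.Module):
    """Convolution whose pre-trained weights are frozen; only lightweight
    per-channel scaling and shifting parameters are (meta-)learned.
    Names and shapes are illustrative, not taken from the paper's code."""

    def __init__(self, pretrained_conv: nn.Conv2d):
        super().__init__()
        # Freeze the pre-trained weights (and bias, if present).
        self.weight = nn.Parameter(pretrained_conv.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = (nn.Parameter(pretrained_conv.bias.detach().clone(),
                                  requires_grad=False)
                     if pretrained_conv.bias is not None else None)
        self.stride = pretrained_conv.stride
        self.padding = pretrained_conv.padding
        out_channels = self.weight.shape[0]
        # Scale starts at 1 and shift at 0, so training begins from the
        # unmodified pre-trained network.
        self.scale = nn.Parameter(torch.ones(out_channels, 1, 1, 1))
        self.shift = nn.Parameter(torch.zeros(out_channels))

    def forward(self, x):
        # Element-wise scaling of the frozen weights, additive shift on the bias.
        scaled_weight = self.weight * self.scale
        shifted_bias = self.shift if self.bias is None else self.bias + self.shift
        return F.conv2d(x, scaled_weight, shifted_bias,
                        stride=self.stride, padding=self.padding)
```

Because only `scale` and `shift` receive gradients, each layer contributes roughly two parameters per output channel instead of a full retrained kernel, which is what keeps adaptation cheap and overfitting in check.

The HT meta-batch idea can likewise be summarized as "record the classes the meta-learner fails on, then re-sample tasks from them." The sketch below is a simplified, assumed loop; `sample_task` and `meta_train_step` are hypothetical helpers standing in for ordinary episode sampling and one meta-training update.

```python
from collections import Counter

def hard_task_meta_batch(sample_task, meta_train_step,
                         num_tasks=10, num_hard_tasks=5):
    """One HT meta-batch: train on ordinary tasks, remember each task's
    worst ("failure") class, then re-sample extra tasks biased toward
    those classes. Helper functions are placeholders, not the paper's API."""
    failure_classes = []
    for _ in range(num_tasks):
        task = sample_task()                    # ordinary N-way K-shot episode
        per_class_acc = meta_train_step(task)   # e.g. dict: class -> query accuracy
        failure_classes.append(min(per_class_acc, key=per_class_acc.get))

    # The most frequent failure classes form the pool for hard tasks.
    hard_pool = [cls for cls, _ in Counter(failure_classes).most_common()]
    for _ in range(num_hard_tasks):
        hard_task = sample_task(from_classes=hard_pool)
        meta_train_step(hard_task)
```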

Experimental Setup and Results

The authors conduct extensive experiments on two well-known few-shot learning benchmarks: miniImageNet and Fewshot-CIFAR100 (FC100). A series of tasks, covering 5-class 1-shot and 5-class 5-shot classification, is used to evaluate the methods (episode construction is sketched below).
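
For context, an N-way K-shot episode in these benchmarks is typically assembled as in the sketch below; `dataset_by_class` (a mapping from class label to that class's images) and the 15-query default are assumptions for illustration, not names or settings taken from the paper.

```python
import random

def sample_episode(dataset_by_class, n_way=5, k_shot=1, query_per_class=15):
    """Draw one N-way K-shot episode: choose n_way classes, then k_shot
    support images and query_per_class query images per class."""
    classes = random.sample(list(dataset_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        images = random.sample(dataset_by_class[cls], k_shot + query_per_class)
        support += [(img, label) for img in images[:k_shot]]
        query += [(img, label) for img in images[k_shot:]]
    return support, query
```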

miniImageNet

Using a ResNet-12 architecture pre-trained on large-scale data, MTL achieved significant improvements in classification accuracy compared to both baseline and state-of-the-art methods. In particular, the approach achieved:

  • 60.2% accuracy in 1-shot and 74.3% in 5-shot tasks with the standard meta-batch scheme.
  • 61.2% accuracy in 1-shot and 75.5% in 5-shot tasks with the HT meta-batch scheme.

These results outperform several existing methods, including approaches built on more complex meta-learning strategies such as Model-Agnostic Meta-Learning (MAML) and memory networks.

Fewshot-CIFAR100

On the FC100 benchmark, which introduces stricter training-test splits and more challenging meta-learning scenarios, MTL again outperformed competitive baselines:

  • 43.6% accuracy in 1-shot and 55.4% in 5-shot tasks with the standard meta-batch scheme.
  • 45.1% accuracy in 1-shot and 57.6% in 5-shot tasks with the HT meta-batch scheme.

Ablation Studies

To substantiate the contributions of various MTL components, the authors carried out ablation studies comparing several configurations:

  • Baseline Methods: Demonstrated the performance of training without any meta-learning or transfer learning.
  • Fine-Tuning (FT) Variants: Explored the effects of fine-tuning different parts of the network (classifier only, final block, or entire network).

Results showed that the FT variants significantly underperformed MTL, underscoring the value of the SS operations and the meta-batch strategies in achieving higher accuracy and faster convergence.

Implications and Future Work

The insights from this paper have theoretical and practical implications. By illustrating that transfer learning can be effectively combined with meta-learning through SS operations, the work demonstrates a feasible path to overcoming overfitting in few-shot scenarios. Moreover, the HT meta-batch strategy opens avenues for curriculum learning-based improvements in meta-training.

Future developments could focus on further refining the SS operations, exploring different architectures, and evaluating the adaptability of MTL in other few-shot learning contexts, such as few-shot reinforcement learning or unsupervised tasks. Understanding the broader applicability of HT meta-batch strategies across other machine learning paradigms is another promising direction.

Overall, this paper contributes valuable methodologies to the ongoing challenge of few-shot learning, providing both robust theoretical underpinnings and practical advancements.

Authors (4)
  1. Qianru Sun (65 papers)
  2. Yaoyao Liu (19 papers)
  3. Tat-Seng Chua (359 papers)
  4. Bernt Schiele (210 papers)
Citations (1,008)