
A Survey of Deep Meta-Learning

Published 7 Oct 2020 in cs.LG, cs.AI, and stat.ML | arXiv:2010.03522v2

Abstract: Deep neural networks can achieve great successes when presented with large data sets and sufficient computational resources. However, their ability to learn new concepts quickly is limited. Meta-learning is one approach to address this issue, by enabling the network to learn how to learn. The field of Deep Meta-Learning advances at great speed, but lacks a unified, in-depth overview of current techniques. With this work, we aim to bridge this gap. After providing the reader with a theoretical foundation, we investigate and summarize key methods, which are categorized into (i) metric-, (ii) model-, and (iii) optimization-based techniques. In addition, we identify the main open challenges, such as performance evaluations on heterogeneous benchmarks, and reduction of the computational costs of meta-learning.

Citations (281)

Summary

  • The paper surveys deep meta-learning, classifying methods into metric-based, model-based, and optimization-based approaches to structure the field.
  • It details the mechanisms and characteristics of each major category, noting strengths like robustness in metric-based, flexibility in model-based, and strong adaptation in optimization-based methods.
  • Challenges such as overfitting and out-of-distribution generalization are discussed alongside future directions including new benchmarks and reducing computational costs.

Deep Meta-Learning: A Comprehensive Survey

Deep meta-learning has gained significant traction as a method for enabling neural networks to rapidly adapt and learn new tasks from minimal data. This paper, "A Survey of Deep Meta-Learning" by Huisman, van Rijn, and Plaat, provides an exhaustive examination of the field, categorizing approaches into metric-based, model-based, and optimization-based strategies. The paper aims to offer a coherent overview and establish a framework for understanding deep meta-learning techniques, which share the goal of leveraging prior experience to expedite learning on new tasks.

The paper outlines the deep meta-learning paradigm, contrasting it with traditional learning approaches and examining the broader context of meta-learning in domains such as transfer and multi-task learning. It underscores the practical significance of meta-learning, particularly in applications constrained by limited data and resource availability, and explores contemporary techniques from both theoretical and practical perspectives.
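The meta-learning paradigm contrasted here is typically instantiated as episodic N-way, k-shot learning: each "task" is a small classification episode with a support set for adaptation and a query set for evaluation. As a minimal NumPy sketch of that setup (the function name and parameters are illustrative, not from the paper):

```python
import numpy as np

def sample_episode(X, y, n_way=5, k_shot=1, n_query=5, rng=None):
    """Sample one N-way k-shot episode from a labeled dataset.

    Picks n_way classes, then k_shot support and n_query query
    examples per class, relabeling classes as 0..n_way-1 within
    the episode (the standard few-shot meta-learning setup).
    """
    if rng is None:
        rng = np.random.default_rng()
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        idx = rng.permutation(np.where(y == c)[0])
        support += [(X[i], episode_label) for i in idx[:k_shot]]
        query += [(X[i], episode_label) for i in idx[k_shot:k_shot + n_query]]
    return support, query
```

A meta-learner is then trained across many such episodes, so that adaptation on a support set transfers to the corresponding query set.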

Metric-based Techniques

Metric-based meta-learning focuses on computing input similarities in an effective feature space. Techniques such as Siamese networks and matching networks pioneered this approach, using pair-wise comparisons to predict class membership from learned embeddings. Subsequent developments, such as prototypical networks and relation networks, refined the process by introducing conceptual innovations like class prototypes and learned neural similarity metrics. Although simple and robust, metric-based methods are predominantly confined to supervised learning contexts.
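The prototypical-network idea can be sketched compactly: average each class's support embeddings into a prototype, then classify queries by nearest prototype. The sketch below assumes embeddings have already been produced by some trained encoder (not shown); function names are illustrative.

```python
import numpy as np

def prototypes(support_embeddings, support_labels, n_classes):
    """Class prototype = mean embedding of that class's support examples."""
    return np.stack([support_embeddings[support_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_embedding, protos):
    """Predict the class whose prototype is nearest in Euclidean distance."""
    dists = np.linalg.norm(protos - query_embedding, axis=1)
    return int(np.argmin(dists))
```

In the actual method the encoder is trained end-to-end so that these distances become discriminative; the nearest-prototype rule itself stays this simple.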

Model-based Techniques

Model-based approaches leverage internal states and task embeddings to drive learning, capturing task-specific information through dynamic internal architectures. Methods like memory-augmented neural networks harness external memory structures to facilitate rapid learning, while architectures such as SNAIL incorporate attention mechanisms to address the memory constraints of recurrent models. Despite their flexibility and conceptual appeal, model-based techniques often scale poorly to larger tasks and tend to generalize less well beyond the training task distribution.
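The core external-memory operation in memory-augmented networks is content-based addressing: compare a key against stored rows and return an attention-weighted read. A minimal NumPy sketch of that read step (a simplification of the full read/write machinery, with illustrative names):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(memory, key):
    """Content-based memory read.

    Scores each memory row by cosine similarity to the key, converts
    the scores to attention weights, and returns the weighted sum of
    rows as the read vector (plus the weights themselves).
    """
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = softmax(sims)
    return weights @ memory, weights
```

Rapid learning comes from writing new task information into `memory` at inference time, so adaptation happens without any weight updates.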

Optimization-based Techniques

By framing meta-learning as a bi-level optimization problem, optimization-based approaches aim to learn task adaptation strategies directly. Early work such as the LSTM meta-learner, which learns the optimizer's update rule, gave way to techniques like MAML and Reptile that instead optimize the initial weights for swift gradient-based adaptation. Probabilistic extensions, such as PLATIPUS and LLAMA, enhance generalization by modeling uncertainty, though they may increase computational overhead.
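The initialization-learning idea is easiest to see in Reptile, which avoids second-order gradients: adapt to a task with a few inner SGD steps, then nudge the shared initialization toward the adapted weights. A toy sketch on scalar tasks, where task c has loss (theta - c)^2 (the setup and hyperparameter values are illustrative, not from the paper):

```python
def reptile(theta, task_targets, inner_steps=10, alpha=0.1,
            epsilon=0.5, meta_iters=100):
    """Reptile sketch on toy scalar tasks with loss (theta - c)**2."""
    for _ in range(meta_iters):
        for c in task_targets:
            phi = theta
            for _ in range(inner_steps):
                phi -= alpha * 2 * (phi - c)   # inner-loop SGD on the task loss
            theta += epsilon * (phi - theta)   # meta-update: move the
                                               # initialization toward the
                                               # adapted parameters
    return theta
```

With targets such as 1.0 and 3.0, the learned initialization settles between the task optima, so either task can be reached in a handful of inner steps; MAML reaches a similar initialization by backpropagating through the inner-loop updates instead.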

Empirical Evaluation and Comparison

Performance on standard benchmarks, notably miniImageNet, varies significantly across techniques. Optimization-based methods generally yield superior results, consistent with their strong adaptation across more diverse task distributions. Nevertheless, enhanced network architectures often underpin this success, signaling the interplay between methodological advances and backbone capacity.

Challenges and Future Directions

The survey addresses several open challenges within deep meta-learning, notably the tendency toward overfitting in limited data scenarios, and difficulty generalizing to tasks outside the training distribution. The authors call attention to Meta-Dataset, an expanded benchmark designed to test broader task generalization, as a future research impetus. They advocate for reducing meta-learning's computational demands and exploring domains such as online and active learning settings. Furthermore, the authors highlight the potential power of compositional architectures and hierarchical meta-learning frameworks to further deepen learning efficiency and effectiveness.

This paper serves as both a foundational introduction and a critical reference, offering insights and identifying pathways for future research in the evolving arena of deep meta-learning. As the field matures, emphasizing tasks closer to practical scenarios while exploring improved generalization mechanisms will likely become focal points for advancing this promising amalgam of learning and adaptation.
