An Analysis of "Task Agnostic Continual Learning via Meta Learning"
The paper "Task Agnostic Continual Learning via Meta Learning" introduces an approach to continual learning in non-stationary environments that requires no prior information about task boundaries. The authors combine meta-learning techniques with continual learning to enable rapid recall of previously seen tasks. This marks a significant shift in focus: away from the traditional continual learning goal of avoiding forgetting, and toward methods that accelerate task re-acquisition, which the authors term "faster remembering."
Problem Definition and Challenges in Continual Learning
Conventional learning models assume that data is independently and identically distributed (i.i.d.). This assumption fails in many real-world scenarios, such as reinforcement learning and certain supervised learning settings, where the data distribution is non-stationary. Continual learning addresses this by treating learning as an ongoing process across a sequence of tasks. Historically, most approaches assume known task boundaries and focus on mitigating catastrophic forgetting, where learning a new task overwrites previously acquired knowledge. The authors instead target scenarios where task boundaries and identities are unknown, contending that existing methods are ineffective in such settings.
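To make the setting concrete, the toy generator below (a sketch of ours, not the paper's) produces a stream in which the underlying task switches at random points while the learner observes only input-output pairs; every name in it is illustrative.

```python
import numpy as np

def task_agnostic_stream(tasks, seg_len=(100, 500), seed=0):
    """Toy non-stationary stream: the active task switches at random
    points, and the learner sees only (x, y) pairs -- no task id and
    no boundary signal. `tasks` is a list of callables x -> y."""
    rng = np.random.default_rng(seed)
    while True:
        task = tasks[rng.integers(len(tasks))]   # identity stays hidden
        for _ in range(rng.integers(*seg_len)):  # boundary stays hidden
            x = rng.uniform(-1.0, 1.0, size=(1,))
            yield x, task(x)

# Three regression tasks the learner must tell apart from data alone.
stream = task_agnostic_stream([np.sin, np.cos, lambda x: -x])
x, y = next(stream)
```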
Proposed Framework: What & How
The paper introduces the "What & How" framework, a task-agnostic continual learning model that separates inferring "what" the current task is from determining "how" to solve it. The framework maintains task-specific and task-agnostic parameters: the former encode a representation of the current task, inferred directly from data, while the latter, optimized through meta-learning, determine how any inferred task should be solved.
Key to this approach is a separation of concerns within the learning system, which allows it to infer the current task from contextual data without explicit task labels. By implementing the model in a supervised learning setting, the researchers demonstrate the effectiveness of distinguishing what to learn from how to learn it across sequential tasks.
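To make the separation concrete, here is a minimal PyTorch sketch of how such a split might look: a shared, meta-learned solver (the "how") conditioned on a task embedding inferred from context data (the "what"). The class and variable names are ours, not the paper's, and the architecture is deliberately simplified.

```python
import torch
import torch.nn as nn

class WhatHowModel(nn.Module):
    """Minimal sketch of the What & How split: a task-agnostic solver
    (the "how") conditioned on a task embedding z (the "what") that is
    inferred from context data rather than given as a label."""

    def __init__(self, in_dim=1, out_dim=1, z_dim=32, hidden=128):
        super().__init__()
        # Task-agnostic ("how") parameters, optimized by meta-learning.
        self.solver = nn.Sequential(
            nn.Linear(in_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )
        # Task-inference network: maps observed (x, y) pairs to a task
        # representation, so no explicit task identity is required.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim + out_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, z_dim),
        )

    def infer_task(self, ctx_x, ctx_y):
        # "What": aggregate a window of recent data into one embedding.
        pairs = torch.cat([ctx_x, ctx_y], dim=-1)
        return self.encoder(pairs).mean(dim=0)  # permutation-invariant

    def forward(self, x, z):
        # "How": solve the inferred task with shared parameters.
        return self.solver(torch.cat([x, z.expand(x.size(0), -1)], dim=-1))
```

In this amortized form the encoder plays the role of task inference; the next section shows how the same split maps onto gradient-based alternatives.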
Meta Learning and Task Inference Alignment
The authors present an instructive alignment between the proposed framework and several meta-learning methods, including Model-Agnostic Meta-Learning (MAML), Conditional Neural Processes (CNP), and Latent Embedding Optimization (LEO). This alignment builds a conceptual bridge between meta-learning and continual learning: meta-learning is leveraged for task inference, which improves the adaptability of models in dynamically evolving environments.
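As one concrete reading of that alignment, the fragment below swaps the amortized, CNP-style encoder of the earlier sketch for MAML-style task inference: the task representation starts from a meta-learned initialization and is adapted by a few gradient steps on context data. This is again our illustrative sketch (first-order, for brevity), not the paper's code.

```python
import torch
import torch.nn.functional as F

def infer_task_maml(z0, model, ctx_x, ctx_y, inner_lr=0.1, steps=5):
    """MAML-style task inference: adapt the task embedding z from a
    meta-learned initialization z0 with a few gradient steps on the
    context set, reusing the shared "how" network unchanged."""
    z = z0.clone().requires_grad_(True)
    for _ in range(steps):
        loss = F.mse_loss(model(ctx_x, z), ctx_y)  # fit the context data
        (grad,) = torch.autograd.grad(loss, z)
        z = (z - inner_lr * grad).detach().requires_grad_(True)
    return z.detach()
```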
In their implementation, task-specific parameters are updated through task inference, while continual learning methods are applied to the meta-level components. This dual-level optimization yields a robust mechanism for learning over a task stream without known boundaries. The application of Bayesian Gradient Descent (BGD) to stabilize learning of the meta-level parameters is highlighted as a significant contribution.
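A sketch of the BGD update for the meta-level parameters is shown below. It follows the closed-form mean and variance updates of Zeno et al. (2018) for a factorized Gaussian posterior over the weights; the function name and the NumPy framing are ours.

```python
import numpy as np

def bgd_step(mu, sigma, loss_grad, eta=1.0, K=10):
    """One Bayesian Gradient Descent update for a posterior N(mu, sigma^2)
    over parameters. `loss_grad(theta)` returns dL/dtheta at a sample."""
    eps = np.random.randn(K, *mu.shape)               # Monte Carlo samples
    grads = np.stack([loss_grad(mu + e * sigma) for e in eps])
    e_g = grads.mean(axis=0)                          # E[dL/dtheta]
    e_g_eps = (grads * eps).mean(axis=0)              # E[dL/dtheta * eps]

    # Mean update scaled by the posterior variance: weights the model is
    # certain about (small sigma) barely move, which protects knowledge
    # of earlier tasks without needing task boundaries.
    new_mu = mu - eta * sigma**2 * e_g
    # Closed-form variance update; sigma shrinks where gradients agree.
    new_sigma = (sigma * np.sqrt(1 + (0.5 * sigma * e_g_eps)**2)
                 - 0.5 * sigma**2 * e_g_eps)
    return new_mu, new_sigma
```

Because the effective step size is sigma squared, consolidation emerges from the posterior itself rather than from any externally supplied task schedule.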
Implications and Future Directions
This research has considerable implications, both theoretical and practical. Theoretically, it proposes a new paradigm for tackling catastrophic forgetting by reframing it as a problem of rapid task recall and adaptation. Practically, it suggests a pathway toward lifelong learning systems capable of retaining and recalling multiple tasks without explicit task demarcations.
For future research, the application of this framework to reinforcement learning tasks in partially observable settings appears promising. Investigating these methods could pave the way for even more generalized learning systems, adaptive to complex, real-world applications where task structures are inherently unlabelled or continuously evolving.
In conclusion, the paper makes a substantive contribution to continual and meta-learning by proposing a coherent framework for effective learning across sequences of tasks without prior task information. This shift in perspective opens new avenues for handling non-stationary data distributions in real-world applications and demonstrates the potential of combining meta-learning principles with continual learning strategies.