
Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL (1812.07671v2)

Published 18 Dec 2018 in cs.LG, cs.AI, cs.RO, and stat.ML

Abstract: Humans and animals can learn complex predictive models that allow them to accurately and reliably reason about real-world phenomena, and they can adapt such models extremely quickly in the face of unexpected changes. Deep neural network models allow us to represent very complex functions, but lack this capacity for rapid online adaptation. The goal in this paper is to develop a method for continual online learning from an incoming stream of data, using deep neural network models. We formulate an online learning procedure that uses stochastic gradient descent to update model parameters, and an expectation maximization algorithm with a Chinese restaurant process prior to develop and maintain a mixture of models to handle non-stationary task distributions. This allows for all models to be adapted as necessary, with new models instantiated for task changes and old models recalled when previously seen tasks are encountered again. Furthermore, we observe that meta-learning can be used to meta-train a model such that this direct online adaptation with SGD is effective, which is otherwise not the case for large function approximators. In this work, we apply our meta-learning for online learning (MOLe) approach to model-based reinforcement learning, where adapting the predictive model is critical for control; we demonstrate that MOLe outperforms alternative prior methods, and enables effective continuous adaptation in non-stationary task distributions such as varying terrains, motor failures, and unexpected disturbances.

Authors (3)
  1. Anusha Nagabandi (10 papers)
  2. Chelsea Finn (264 papers)
  3. Sergey Levine (531 papers)
Citations (189)

Summary

Overview of Deep Online Learning via Meta-Learning for Continual Adaptation in Model-Based RL

The paper presents a method for deep online learning under non-stationary task distributions using deep neural networks. Traditional deep neural network models, while capable of representing complex functions, lack the capacity for rapid adaptation that real-world, dynamic environments demand. The authors propose a novel approach blending meta-learning with model-based reinforcement learning (RL) to enable continual online adaptation. Their approach, termed Meta-Learning for Online Learning (MOLe), aims to outperform existing strategies in scenarios where tasks vary dynamically over time, such as shifting terrains and unexpected motor failures.

Methodology

The core of the MOLe algorithm is its method of continuously adapting a mixture of models over neural network parameters using stochastic gradient descent (SGD). The mixture is maintained with expectation maximization (EM) under a Chinese restaurant process (CRP) prior, which lets the system instantiate new models as tasks change and recall old models when previously seen tasks reappear.
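As a rough illustration, the PyTorch sketch below implements one simplified step of this procedure: an E-step scores every existing model, plus a fresh copy of the meta-learned prior, under the CRP prior, and an M-step adapts the winning model with SGD. All names here (DynamicsModel, mole_step, the concentration parameter alpha) are assumptions for exposition rather than the authors' released code, and the hard assignment to the most likely model simplifies the paper's posterior-weighted update of all models.

```python
import copy

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicsModel(nn.Module):
    """Toy stand-in for the paper's learned dynamics model: predicts the
    next state from the current state and action."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def mole_step(models, counts, prior_model, batch, alpha=1.0, lr=1e-2):
    """One simplified online step. E-step: score every existing model plus
    a fresh copy of the meta-learned prior under a CRP prior. M-step:
    adapt the most likely model with one SGD step. Hard assignment is a
    simplification of the paper's posterior-weighted update."""
    s, a, s_next = batch
    candidates = models + [copy.deepcopy(prior_model)]
    # CRP prior: existing models in proportion to past usage, a new model
    # in proportion to the concentration parameter alpha.
    crp = np.array(counts + [alpha], dtype=float) / (sum(counts) + alpha)
    with torch.no_grad():  # Gaussian errors: log-likelihood ~ -MSE, up to constants
        log_lik = np.array([-F.mse_loss(m(s, a), s_next).item()
                            for m in candidates])
    k = int((np.log(crp) + log_lik).argmax())
    if k == len(models):  # the CRP chose to spawn a new task model
        models.append(candidates[k])
        counts.append(0)
    counts[k] += 1
    # M-step: online SGD adaptation of the selected model.
    opt = torch.optim.SGD(models[k].parameters(), lr=lr)
    loss = F.mse_loss(models[k](s, a), s_next)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return k  # index of the task model used at this step
```

Appending a fresh copy of the prior as the final candidate is how the CRP's "new table" probability is realized: if unseen dynamics explain the data better than any existing model, a new task model is spawned from the meta-learned initialization.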

Key aspects of the methodology include:

  1. Stochastic Gradient Descent Online Adaptation: Model parameters are adjusted in real time with SGD, a standard tool for optimizing over continuous streams of data (the sketch above shows the adaptation step).
  2. Mixture Model and Task Identification: An expectation maximization procedure scores each incoming batch of data under every model in the mixture, and the Chinese restaurant process prior governs when a new model is instantiated. The system infers task boundaries autonomously rather than receiving them as labels.
  3. Meta-Learning for Prior Initialization: Model-Agnostic Meta-Learning (MAML) supplies a strong initial configuration of model parameters, enabling effective few-shot adaptation from limited data; this meta-trained prior is what makes direct online adaptation with SGD viable for large function approximators (a meta-training sketch follows this list).
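Because the method depends on that MAML-trained prior, a minimal sketch of the corresponding meta-training step may help. It assumes PyTorch 2.x (torch.func.functional_call), mean-squared prediction error, and the illustrative DynamicsModel above; the function name maml_meta_step and the support/query batch layout are assumptions for exposition, not the authors' code.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch 2.x

def maml_meta_step(model, task_batches, inner_lr, meta_opt):
    """One MAML meta-training step (sketch). `task_batches` is a list of
    (support, query) pairs, each a (states, actions, next_states) tuple.
    The prior is optimized so that a single inner SGD step on support
    data yields low prediction error on query data."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for (s_sup, a_sup, y_sup), (s_qry, a_qry, y_qry) in task_batches:
        # Inner step: one gradient step from the shared prior, kept
        # differentiable (create_graph=True) so the outer loss can
        # backpropagate through the adaptation itself.
        inner_loss = F.mse_loss(
            functional_call(model, params, (s_sup, a_sup)), y_sup)
        grads = torch.autograd.grad(inner_loss, list(params.values()),
                                    create_graph=True)
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer loss: post-adaptation prediction error on query data.
        meta_loss = meta_loss + F.mse_loss(
            functional_call(model, adapted, (s_qry, a_qry)), y_qry)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
    return meta_loss.item() / len(task_batches)
```

The create_graph=True flag is the essential MAML detail: it lets the outer loss differentiate through the inner SGD step, so the prior is trained specifically for good post-adaptation performance rather than good average performance.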

Results and Implications

The MOLe framework demonstrates significant improvements in model-based RL experiments on a suite of simulated robotic tasks. In particular, MOLe outperformed baseline methods at continuous adaptation across changing task distributions. Noteworthy results include:

  • On tasks such as traversing variable terrain or coping with motor malfunctions, MOLe shows strong adaptation and recall, successfully managing both abrupt and gradual task switches.
  • MOLe extends neural network models beyond their initial training distribution, handling substantially different, out-of-distribution tasks that traditional methods struggle with.
  • Because MOLe requires no predefined task categorization, it organizes tasks on its own; this flexibility and resilience in ambiguous environments suits autonomous systems operating under complex, real-world dynamics.

Future Directions

The MOLe algorithm extends the capabilities of meta-learning and model-based reinforcement learning, pointing toward a new paradigm in adaptive AI systems. Future research might refine the meta-learning initialization or pursue applications in domains such as time series analysis and real-time data processing. Such advances could broaden MOLe's impact across diverse industries, contributing to more responsive and intelligent systems. MOLe thus represents a careful blend of deep learning strategies for achieving nimble adaptability in RL environments that approach the complexity of human and animal behavior.