Functional Regularisation for Continual Learning with Gaussian Processes (1901.11356v4)

Published 31 Jan 2019 in stat.ML and cs.LG

Abstract: We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network. This method, referred to as functional regularisation for Continual Learning, avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function. To achieve this we rely on a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. Then, the training algorithm sequentially encounters tasks and constructs posterior beliefs over the task-specific functions by using inducing point sparse Gaussian process methods. At each step a new task is first learnt and then a summary is constructed consisting of (i) inducing inputs -- a fixed-size subset of the task inputs selected such that it optimally represents the task -- and (ii) a posterior distribution over the function values at these inputs. This summary then regularises learning of future tasks, through Kullback-Leibler regularisation terms. Our method thus unites approaches focused on (pseudo-)rehearsal with those derived from a sequential Bayesian inference perspective in a principled way, leading to strong results on accepted benchmarks.

Authors (5)
  1. Michalis K. Titsias (39 papers)
  2. Jonathan Schwarz (12 papers)
  3. Alexander G. de G. Matthews (10 papers)
  4. Razvan Pascanu (138 papers)
  5. Yee Whye Teh (162 papers)
Citations (171)

Summary

  • The paper presents a novel method that leverages Gaussian processes to perform functional regularisation, preventing catastrophic forgetting in sequential task learning.
  • It employs inducing point sparse GP methodology to summarize each task with an approximate posterior over the function space of neural network outputs.
  • Experiments on benchmarks like Permuted MNIST, Split MNIST, and Omniglot demonstrate competitive accuracy with efficient task summarisation using minimal inducing points.

Functional Regularisation for Continual Learning with Gaussian Processes

The paper "Functional Regularisation for Continual Learning with Gaussian Processes," authored by Michalis K. Titsias et al., introduces an approach to continual learning (CL) that performs Bayesian inference over the function space rather than the parameter space of a deep neural network. The authors refer to this methodology as functional regularisation for Continual Learning.

Overview and Methodology

Continual learning involves developing systems that learn new tasks sequentially without substantial retraining on previously acquired data. A key challenge in CL is mitigating catastrophic forgetting, where learning a new task leads to the loss of information regarding previously learned tasks. The proposed framework achieves this by maintaining and updating an approximate posterior belief over the function specific to each task.

The researchers employ Gaussian processes (GPs) to construct and preserve these beliefs. Treating the last-layer weights of a neural network as Gaussian-distributed random variables yields a GP over the network's outputs, which is then handled with inducing point sparse GP methods. For every encountered task, a summary is constructed consisting of the inducing inputs, a fixed-size subset of the task inputs chosen to represent the task well, and the posterior distribution over the function values at these inputs. This summary regularises the learning of future tasks through Kullback-Leibler terms, uniting (pseudo-)rehearsal approaches with those derived from a sequential Bayesian inference perspective.
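To make the structure of this objective concrete, the following is a minimal PyTorch sketch of such a KL regulariser. It assumes a single-output head, a standard-normal prior on the last-layer weights, and an illustrative dictionary layout for the stored summaries; none of the names or shapes are taken from the authors' implementation.

```python
# Illustrative sketch only: names, shapes, and the single-output simplification
# are assumptions, not the authors' implementation.
import torch
from torch.distributions import MultivariateNormal, kl_divergence

def functional_regulariser(feature_net, task_summaries, jitter=1e-4):
    """Sum of KL terms between each stored task posterior and the GP prior
    induced by the *current* shared feature extractor at that task's
    inducing inputs.

    Each summary is assumed to hold:
      "Z"  -- inducing inputs for the task, shape (M, input_dim)
      "mu" -- posterior mean over function values at Z, shape (M,)
      "L"  -- Cholesky factor of the posterior covariance, shape (M, M)
    """
    reg = torch.zeros(())
    for summary in task_summaries:
        phi = feature_net(summary["Z"])            # (M, D) features at Z
        M = phi.shape[0]
        # Prior over function values at Z: N(0, Phi Phi^T), obtained by treating
        # the last-layer weights as standard-normal random variables.
        K = phi @ phi.T + jitter * torch.eye(M)
        prior = MultivariateNormal(torch.zeros(M), covariance_matrix=K)
        posterior = MultivariateNormal(summary["mu"], scale_tril=summary["L"])
        reg = reg + kl_divergence(posterior, prior)
    return reg

# During training on a new task, this term is added to the usual task loss:
# loss = task_nll + functional_regulariser(feature_net, stored_summaries)
```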

Numerical Results and Claims

The paper reports strong results across several standard benchmarks, including Permuted MNIST, Split MNIST, and Omniglot, showing that functional regularisation is competitive with other state-of-the-art continual learning methods. Notably, the proposed approach achieves high accuracy with only a small number of inducing points per task, showcasing its efficiency in compressing and summarising tasks.

Theoretical and Practical Implications

The theoretical contribution of the paper lies in its application of Bayesian inference in function space to CL, circumventing the brittleness associated with regularising directly against parameter drift in neural networks. Practically, the method scales to sequences of tasks and improves over naive replay-based methods by capturing uncertainty in the function estimates. This refined replay mechanism draws its advantage from the inducing points, which capture and compress previous task data efficiently, as sketched below.
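As an illustration of what such a compressed task summary might contain, the following hedged sketch shows one way a summary could be assembled and stored after training on a task. The random subset selection and the isotropic covariance are placeholders: the paper selects inducing inputs so that they optimally represent the task and obtains the posterior from sparse variational GP inference.

```python
# Illustrative sketch only: the random choice of inducing inputs and the
# placeholder covariance are assumptions, not the paper's procedure.
import torch

def build_task_summary(feature_net, head, task_inputs, num_inducing=40):
    # Placeholder selection: a random fixed-size subset of the task inputs.
    idx = torch.randperm(task_inputs.shape[0])[:num_inducing]
    Z = task_inputs[idx]

    with torch.no_grad():
        phi = feature_net(Z)          # (M, D) features at the inducing inputs
        mu = head(phi).squeeze(-1)    # placeholder mean (single-output head)

    # Placeholder Cholesky factor of the posterior covariance (0.01 * I).
    L = 0.1 * torch.eye(num_inducing)

    return {"Z": Z.detach(), "mu": mu, "L": L}

# After finishing task k:
# stored_summaries.append(build_task_summary(feature_net, head_k, X_task_k))
```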

Future research directions may include expanding this methodology to various domains, including reinforcement learning, and exploring alternative function space representations beyond Gaussian processes to further enrich the model's adaptability and precision.

Conclusion

In summary, this research paper advances the field of continual learning by introducing a principled, function-based regularisation process that effectively integrates concepts from Bayesian inference and sparse Gaussian processes. This integration not only mitigates catastrophic forgetting but also elevates the scalability and efficiency of continual learning systems. As AI continues to evolve, such innovative approaches promise improved learning architectures capable of adapting seamlessly across a succession of diverse tasks.
