Online Continual Learning with Maximally Interfered Retrieval (1908.04742v3)

Published 11 Aug 2019 in cs.LG and stat.ML

Abstract: Continual learning, the setting where a learning agent is faced with a never ending stream of data, continues to be a great challenge for modern machine learning systems. In particular the online or "single-pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks. These approaches typically rely on randomly selecting samples from the replay memory or from a generative model, which is suboptimal. In this work, we consider a controlled sampling of memories for replay. We retrieve the samples which are most interfered, i.e. whose prediction will be most negatively impacted by the foreseen parameters update. We show a formulation for this sampling criterion in both the generative replay and the experience replay setting, producing consistent gains in performance and greatly reduced forgetting. We release an implementation of our method at https://github.com/optimass/Maximally_Interfered_Retrieval.

Online Continual Learning with Maximally Interfered Retrieval

The paper "Online Continual Learning with Maximally Interfered Retrieval" addresses the challenge of continual learning, where a model is exposed to a continuous, non-i.i.d. stream of data and must learn without forgetting previously acquired knowledge. This issue, known as catastrophic forgetting, is significant in neural networks. The paper introduces a novel approach using Maximally Interfered Retrieval (MIR) to enhance the performance of replay-based continual learning methods.

Key Contributions and Methodology

The paper's primary innovation is MIR, a strategy that selects for replay the samples most likely to suffer an increased loss after the upcoming parameter update. The authors argue that strategically choosing these maximally interfered samples, rather than sampling at random, reduces forgetting and improves accuracy.
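Concretely, let $\theta^v = \theta - \alpha \nabla \mathcal{L}(B_n; \theta)$ denote the foreseen one-step update on the incoming batch $B_n$. MIR scores a candidate $(x, y)$ by the loss increase it would suffer under that update (notation lightly simplified from the paper):

$$ s_{\text{MIR}}(x) = \ell\big(f_{\theta^v}(x), y\big) - \ell\big(f_{\theta}(x), y\big), $$

and replays the top-scoring samples, drawn from the memory buffer in the experience-replay variant, or found by searching the generator's latent space for codes $z$ whose decodings $g(z)$ maximize the score in the generative variant.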

  1. Maximally Interfered Retrieval (MIR): MIR targets the samples whose loss would increase most under the foreseen update on incoming data. Focusing replay on these samples consolidates past knowledge while keeping the per-step replay budget small.
  2. Implementation Techniques: The method is instantiated in two settings:
    • Experience Replay (ER): A memory buffer stores a subset of past samples; MIR selectively replays the most interfered ones, outperforming standard ER on Split MNIST and CIFAR-10 (see the sketch after this list).
    • Generative Replay: A generative model synthesizes past samples; MIR searches the generator's latent space for maximally interfered samples, improving over baseline generative replay methods.
  3. Hybrid Approach: The authors also combine storage and generation: an autoencoder compresses incoming data, so the MIR search can run efficiently in latent space. This hybrid model retains more information than raw sample storage at equal budget and proves effective on CIFAR-10.
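To make the ER variant concrete, below is a minimal PyTorch sketch of the MIR retrieval step. It assumes a classifier trained with cross-entropy; the function name `mir_retrieve`, the learning rate, and the batch handling are illustrative, not the authors' released code (which is linked from the abstract).

```python
import copy

import torch
import torch.nn.functional as F


def mir_retrieve(model, buffer_x, buffer_y, new_x, new_y, lr=0.1, k=10):
    """Score buffer samples by how much a virtual update on the
    incoming batch would increase their loss; return the top-k."""
    # Loss of each candidate under the current parameters.
    with torch.no_grad():
        pre_loss = F.cross_entropy(model(buffer_x), buffer_y, reduction="none")

    # Foreseen update: one SGD step on the incoming batch, applied to a
    # copy of the model so the real parameters remain untouched.
    virtual = copy.deepcopy(model)
    opt = torch.optim.SGD(virtual.parameters(), lr=lr)
    opt.zero_grad()
    F.cross_entropy(virtual(new_x), new_y).backward()
    opt.step()

    # Interference score: post-update loss minus pre-update loss.
    with torch.no_grad():
        post_loss = F.cross_entropy(virtual(buffer_x), buffer_y, reduction="none")
    scores = post_loss - pre_loss

    # Replay the k most interfered samples.
    idx = scores.topk(min(k, scores.numel())).indices
    return buffer_x[idx], buffer_y[idx]
```

In a full training loop, the retrieved samples would be concatenated with the incoming batch before the real update; the paper additionally subsamples a candidate set from the buffer rather than scoring the entire memory, which keeps the search cost constant.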

Results and Performance

  • Numerical Results: On Split MNIST, MIR-based ER achieves a significant improvement over standard ER, with higher accuracy and reduced forgetting. Similar gains hold in more challenging settings such as CIFAR-10.
  • Generative Replay Improvements: On MNIST and Permuted MNIST datasets, MIR significantly improves both classifier performance and generative model stability.
  • Hybrid Model Success: The hybrid strategy attains higher task accuracy and lower forgetting, leveraging compressed latent representations.

Implications and Future Directions

The proposed methodology has significant implications for real-time applications in which a learning system must adapt continuously without explicitly revisiting prior data, such as robotics or adaptive user interfaces.

Theoretically, the approach underscores the importance of strategic sample selection in continual learning. MIR's effectiveness indicates that understanding sample interference can yield valuable insights into memory efficiency and task prioritization.

Future work could explore MIR's integration with other forgetting mitigation methods or its adaptability to other machine learning paradigms beyond classification. Examining its scalability and performance on larger, more diverse datasets could further solidify its practical applicability.

In summary, the introduction of Maximally Interfered Retrieval offers a promising direction in overcoming challenges inherent to online continual learning, showcasing improvements in memory utilization and knowledge retention.

Authors (7)
  1. Rahaf Aljundi
  2. Lucas Caccia
  3. Eugene Belilovsky
  4. Massimo Caccia
  5. Min Lin
  6. Laurent Charlin
  7. Tinne Tuytelaars
Citations (477)