- The paper introduces a two-phase methodology where the progress phase learns new tasks with feature reuse and the compress phase consolidates knowledge via online EWC.
- The paper demonstrates reduced catastrophic forgetting and improved performance over standard fine-tuning in experiments on Omniglot, Atari, and 3D maze navigation.
- The paper argues that maintaining a fixed parameter count keeps the approach scalable, while connections to the knowledge base promote efficient forward transfer across sequential learning tasks.
Progress & Compress: A Scalable Framework for Continual Learning
The paper "Progress & Compress: A scalable framework for continual learning" by Jonathan Schwarz et al. addresses a critical issue in machine learning, particularly in domains where tasks are learned sequentially. Continual learning, defined as the ability to learn new tasks without forgetting previous knowledge, presents unique challenges in both supervised and reinforcement learning (RL) contexts. The proposed framework, dubbed Progress & Compress (P&C), offers a novel solution to these challenges through an architecture that maintains a constant number of parameters and avoids catastrophic forgetting while facilitating positive transfer.
Methodology
The core structure of P&C consists of two neural components: a knowledge base and an active column. The knowledge base retains previously learned knowledge, and the active column is dedicated to learning new tasks. The framework operates in two distinct phases:
- Progress Phase: The active column learns the new task by adapting its parameters while the knowledge base is kept frozen. Layerwise connections from the knowledge base into the active column allow features learned on earlier tasks to be reused, in the spirit of Progressive Networks. Unlike Progressive Networks, however, the architecture does not grow: a single active column is reused for every new task.
- Compress Phase: The knowledge acquired by the active column is distilled into the knowledge base, i.e. the knowledge base is trained to match the active column's outputs so that new knowledge is absorbed without overwriting old knowledge. Forgetting is mitigated with a modified Elastic Weight Consolidation (EWC) penalty, which constrains the knowledge-base parameters from drifting too far from values that were important for earlier tasks. Traditional EWC scales poorly because it accumulates one regularization term per task; P&C instead uses an "online EWC" variant that maintains a single running penalty and therefore a constant computational and memory cost (a compact form of this objective is sketched after this list).
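To make the compress phase concrete, its objective can be written as a distillation term plus the online EWC penalty. The notation below is a paraphrase rather than a quotation of the paper's exact equations: $\pi_k$ denotes the active column's output distribution after learning task $k$, $\theta^{KB}$ the knowledge-base parameters, $F_k$ a diagonal Fisher estimate for task $k$, and $\gamma \le 1$ a forgetting hyperparameter.

$$
\min_{\theta^{KB}} \; \mathbb{E}_{x}\!\left[\mathrm{KL}\!\left(\pi_k(\cdot \mid x)\,\big\|\,\pi^{KB}(\cdot \mid x)\right)\right]
\;+\; \tfrac{1}{2}\,\big\|\theta^{KB} - \theta^{KB}_{k-1}\big\|^{2}_{\gamma F^{*}_{k-1}},
\qquad
F^{*}_{k} \;=\; \gamma\,F^{*}_{k-1} + F_{k}.
$$

Because the decay factor $\gamma$ down-weights older Fisher estimates as new ones are folded in, only one quadratic penalty is ever stored, no matter how many tasks have been seen.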
Experimental Validation
The authors validate P&C across multiple domains:
- Omniglot: Used to assess continual learning in a sequential classification setting spanning 50 handwritten alphabets, each treated as a distinct task. The results indicate that P&C largely preserves performance on previously learned alphabets, clearly outperforming naive fine-tuning and remaining competitive with regularization approaches such as standard EWC and Learning without Forgetting (LwF).
- Atari Games: An RL benchmark in which six games ("Space Invaders," "Krull," "Beamrider," "Hero," "Stargunner," and "Ms. Pac-Man") are learned sequentially. The results demonstrate P&C's ability to maintain high performance across tasks while mitigating negative transfer and catastrophic forgetting.
- 3D Maze Navigation: Used to assess the framework's scalability and forward-transfer capabilities across eight similar navigation tasks. P&C enabled faster learning of later mazes and retained high performance as the number of tasks grew, highlighting its suitability for long task sequences.
Key Findings
- Positive Transfer: P&C improves both generalization and data efficiency when learning new tasks. The active column leverages past knowledge through its connections to the knowledge base, which empirically accelerates learning in domains such as maze navigation where tasks are highly similar (a minimal sketch of this lateral-connection mechanism appears after this list).
- Mitigation of Catastrophic Forgetting: The EWC-regularized compress phase largely prevents previously learned tasks from being forgotten. In the Atari experiments in particular, P&C shows substantial improvements over standard fine-tuning and compares favourably with traditional EWC.
- Scalability: P&C maintains a fixed number of parameters, making it well suited to settings with many sequential tasks. Its memory and computational costs do not grow with the number of tasks, keeping the approach computationally feasible.
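To illustrate how the active column reuses knowledge-base features during the progress phase, here is a minimal NumPy sketch of a single active-column layer with an adapter connection from the frozen knowledge base. The exact adapter parameterization (a small nonlinear projection with a learned elementwise scaling, in the spirit of Progressive Networks) and names such as `ActiveColumnLayer` are illustrative assumptions, not the paper's code.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ActiveColumnLayer:
    """One hidden layer of the active column (illustrative sketch only).

    Combines the layer's own transformation of the previous active-column
    activation with an adapter applied to the corresponding activation of the
    frozen knowledge base, so features from earlier tasks can be reused.
    """

    def __init__(self, d_in, d_out, rng):
        self.W = rng.standard_normal((d_out, d_in)) * 0.1  # active-column weights (trained in the progress phase)
        self.b = np.zeros(d_out)
        self.V = rng.standard_normal((d_in, d_in)) * 0.1   # adapter projection (assumed form)
        self.c = np.zeros(d_in)
        self.U = rng.standard_normal((d_out, d_in)) * 0.1  # adapter output weights
        self.alpha = np.ones(d_out)                        # learned elementwise scaling of the adapter signal

    def forward(self, h_prev_active, h_prev_kb):
        # Nonlinear adapter reuses the knowledge base's features from the layer below.
        adapter = relu(self.V @ h_prev_kb + self.c)
        # Active-column transformation plus the scaled adapter contribution.
        return relu(self.W @ h_prev_active + self.alpha * (self.U @ adapter) + self.b)

# Toy usage: previous-layer activations from the active column and the (frozen) knowledge base.
rng = np.random.default_rng(0)
layer = ActiveColumnLayer(d_in=8, d_out=16, rng=rng)
h_active = rng.standard_normal(8)
h_kb = rng.standard_normal(8)
print(layer.forward(h_active, h_kb).shape)  # -> (16,)
```

In the subsequent compress phase, only the knowledge-base parameters are updated, with the active column providing the distillation targets.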
Discussion and Future Directions
The proposed P&C framework strikes a balance between retaining old knowledge and efficiently learning new tasks. It successfully unifies aspects of existing approaches, such as Progressive Networks and EWC, while addressing their limitations. This framework opens pathways for further refinement, particularly in dealing with more complex task distributions or environments where task boundaries are less distinct.
Potential future developments may explore:
- Gradual Drift: Extending the framework to adapt smoothly to gradual changes in the task distribution when explicit task boundaries are absent.
- Task-Specific Adaptation: More refined mechanisms for task-specific adaptations without the need for parameter growth.
- Hybrid Models: Combining P&C with generative replay mechanisms to further enhance knowledge retention and transfer capabilities.
In sum, the P&C framework represents a significant advancement in the field of continual learning, providing a scalable and efficient solution to one of machine learning's persistent challenges. By maintaining performance on previously learned tasks and accelerating the learning of subsequent tasks, P&C offers a robust methodology applicable across diverse learning contexts.