Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
The paper "Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence" (Chaudhry et al., ECCV 2018) presents a careful treatment of Incremental Learning (IL). The authors address significant gaps in the literature: the absence of a precise problem definition, of appropriate evaluation settings, and of metrics designed specifically for IL.
Incremental Learning algorithms update classifiers with new information while preserving previously learned knowledge. The central challenge of IL is managing the trade-off between forgetting, the loss of previously acquired knowledge, and intransigence, the inability to integrate new information effectively. The authors introduce new metrics to quantify both phenomena, offering a more granular view of an IL algorithm's performance than accuracy alone.
Key Contributions
Novel Metrics for Evaluation
The proposed metrics, Forgetting and Intransigence, capture the behavioral dynamics of IL algorithms. Forgetting measures how much performance on previously learned tasks degrades as new tasks are learned; Intransigence measures how much worse an algorithm performs on a new task compared to a reference model trained jointly on all the data seen so far. Together they expose the trade-off that IL algorithms must balance, given their limited capacity to retain and integrate information continually.
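To make the definitions concrete, here is a minimal sketch of both metrics computed from a matrix of per-task accuracies. The matrix layout, helper names, and reference accuracy below are illustrative assumptions, not the authors' code; the logic follows the paper's definitions (forgetting of a task is the gap between its best-ever accuracy and its final accuracy; intransigence compares against a jointly trained model).

```python
import numpy as np

def average_forgetting(acc):
    """Average forgetting after the final task.

    acc[k][j] = accuracy on task j after training tasks 0..k (j <= k);
    entries for tasks not yet seen can be left at 0.
    Forgetting of task j is its best accuracy at any earlier stage
    minus its accuracy after the final stage.
    """
    acc = np.asarray(acc)
    T = acc.shape[0]  # number of tasks seen so far
    gaps = [acc[:T - 1, j].max() - acc[T - 1, j] for j in range(T - 1)]
    return float(np.mean(gaps))

def intransigence(acc_kk, acc_ref_k):
    """Intransigence on task k.

    acc_kk    -- incremental model's accuracy on task k right after learning it
    acc_ref_k -- accuracy of a reference model trained on all data jointly
    """
    return acc_ref_k - acc_kk

# Example: 3 tasks; rows are training stages, columns are tasks.
acc = [[0.95, 0.0,  0.0 ],
       [0.90, 0.93, 0.0 ],
       [0.85, 0.88, 0.92]]
print(average_forgetting(acc))    # mean of (0.95-0.85) and (0.93-0.88) = 0.075
print(intransigence(0.92, 0.96))  # 0.04
```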
RWalk Algorithm
The paper introduces RWalk, a generalization that unifies and extends the ideas underlying Elastic Weight Consolidation (EWC) and Path Integral (PI) methods. RWalk combines three main components, which come together in the single regularized objective sketched after this list:
- KL-divergence-based Regularization (EWC++): a more efficient and theoretically grounded variant of standard EWC that penalizes parameter drift via a second-order approximation of the KL-divergence between the model's output distributions before and after an update.
- Parameter Importance Score: building on PI, this score accumulates over the optimization trajectory the sensitivity of the loss to each parameter's movement, measured against distance on the Riemannian manifold induced by the Fisher information matrix.
- Representative Sample Storage: Strategies for selecting and using representative samples from previous tasks help mitigate intransigence, thereby enhancing the learning of new tasks without forgetting old ones.
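The following PyTorch-style sketch shows how the first two components could combine into a single regularized loss. The hyperparameter names (lam, alpha, xi), the per-parameter bookkeeping tensors, and the training-loop plumbing are illustrative assumptions consistent with the description above, not the authors' released implementation.

```python
import torch

# Assumed hyperparameters: regularization strength, Fisher EMA decay, damping.
lam, alpha, xi = 1.0, 0.9, 1e-3

def rwalk_penalty(params, params_star, fisher, score):
    """lam * sum_i (F_i + s_i) * (theta_i - theta_i_star)^2

    params_star holds the parameters frozen at the previous task boundary;
    fisher and score are per-parameter tensors of the same shapes.
    """
    penalty = 0.0
    for p, p_star, f, s in zip(params, params_star, fisher, score):
        penalty = penalty + ((f + s) * (p - p_star) ** 2).sum()
    return lam * penalty

def update_fisher_and_score(grads, deltas, fisher, score):
    """Called after each optimizer step, given that step's gradients
    and parameter changes (deltas = theta_new - theta_old)."""
    for i, (g, d) in enumerate(zip(grads, deltas)):
        # EWC++-style Fisher: exponential moving average of squared gradients.
        fisher[i] = alpha * g ** 2 + (1 - alpha) * fisher[i]
        # PI-style importance: per-parameter loss decrease, divided by the
        # approximate KL (Riemannian) distance moved, 1/2 * F * dtheta^2.
        loss_drop = torch.clamp(-g * d, min=0.0)  # credit only loss decreases
        score[i] = score[i] + loss_drop / (0.5 * fisher[i] * d ** 2 + xi)
```

During training on a task, the total loss would be the task loss plus rwalk_penalty(...); at a task boundary, the current parameters become params_star and the accumulated scores are folded into the regularizer (the paper averages newly accumulated scores with previous ones).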
Experiments and Results
RWalk was evaluated on the MNIST and CIFAR-100 datasets, where it achieved higher accuracy than prior regularization-based methods and a better balance between forgetting and intransigence. Notably, RWalk performed well in both the multi-head and single-head evaluation settings. These settings differ in whether the task identifier is known at test time; single-head is the more challenging (and more realistic) setting, since the model must distinguish among all labels seen so far without that cue.
In the single-head setting, RWalk outperformed baseline methods such as EWC and PI, underscoring the importance of retaining discriminative power across tasks. Moreover, the inclusion of representative samples from prior tasks played a significant role in improving intransigence, enabling the model to integrate new knowledge effectively.
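The difference between the two evaluation settings is easy to state in code. The sketch below is illustrative: the model, the logits layout, and the task-to-class mapping are assumptions, not part of the paper.

```python
import torch

def predict(model, x, task_classes=None):
    """Multi-head: the task id is known, so only that task's classes compete.
    Single-head: no task id, so the argmax runs over all classes seen so far."""
    logits = model(x)                          # shape: (batch, num_classes_seen)
    if task_classes is not None:               # multi-head evaluation
        mask = torch.full_like(logits, float('-inf'))
        mask[:, task_classes] = 0.0            # e.g. task_classes = [10, ..., 19]
        logits = logits + mask                 # classes outside the task can't win
    return logits.argmax(dim=1)                # single-head when no mask is given
```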
Implications and Future Directions
The introduction of RWalk marks a significant advance in IL, both practically and theoretically. By striking a better balance between preserving old knowledge and acquiring new information, it paves the way for more robust IL systems.
Practical Implications
The practical utility of RWalk's approach to IL can be seen in various real-world applications where updating models without degradation of previous knowledge is critical. Autonomous driving systems, for instance, must constantly learn about new scenarios without forgetting previously encountered ones. In such dynamic environments, the balance between learning and remembering is paramount.
Theoretical Implications
The theoretically grounded KL-divergence perspective, combined with an efficient computation strategy, sets a precedent for future IL research. Riemannian manifold-based distance measures and the efficient updating of the Fisher information matrix through moving averages are both promising directions for enhancing IL algorithms.
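Concretely, the two ingredients referenced here are the second-order expansion of the KL-divergence, whose curvature term is the Fisher information matrix, and an online estimate of that matrix. The following is a standard formulation consistent with the paper's description; the EMA coefficient α is an assumed notation.

```latex
% Second-order approximation of the KL-divergence between the model's
% output distributions before and after a small update \Delta\theta:
D_{\mathrm{KL}}\!\left( p_{\theta} \,\middle\|\, p_{\theta + \Delta\theta} \right)
  \approx \tfrac{1}{2}\, \Delta\theta^{\top} F_{\theta}\, \Delta\theta

% Fisher information matrix maintained online as a moving average over
% training steps t, avoiding a full pass over the data at task boundaries:
F_{t} = \alpha\, \hat{F}_{t} + (1 - \alpha)\, F_{t-1}
```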
Speculation on Future Developments
Looking ahead, the field of IL may benefit from exploring the integration of sparsity-based regularization techniques, which could further optimize the use of model capacity. Additionally, exploration-based methods may provide new avenues for balancing the learning of new tasks against the remembering of old ones. Future work could also examine the scalability of RWalk to more complex tasks, such as semantic segmentation, where the dimensionality of the problem is significantly higher.
In summary, the "Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence" paper offers a substantial contribution to the IL domain by addressing critical gaps with novel metrics, proposing an advanced algorithm, and validating its efficacy through extensive experiments. The insights and methodologies introduced are poised to influence future research trajectories and practical implementations in the field of AI.