- The paper systematically categorizes class-incremental learning methods into data-, model-, and algorithm-centric approaches to combat catastrophic forgetting.
- It details data replay strategies, architectural adaptations, and knowledge distillation techniques, highlighting trade-offs between accuracy and memory efficiency.
- Experimental evaluations on benchmarks such as CIFAR100 and ImageNet show that dynamic networks achieve the highest accuracy but at the cost of higher memory usage, motivating memory-aware comparisons in future research.
An Overview of Deep Class-Incremental Learning: A Survey
This paper provides a comprehensive survey of deep class-incremental learning (CIL) methods, focusing on mitigating the central challenge of catastrophic forgetting. In open-world scenarios where models must continuously integrate new classes, catastrophic forgetting occurs when training on new classes overwrites the knowledge acquired for previously learned ones. The authors categorize existing methods into data-centric, model-centric, and algorithm-centric approaches, each offering distinct mechanisms to keep accuracy high across all classes seen so far.
Data-Centric Approaches
Data-centric methods rely on exemplar data from earlier tasks to combat forgetting. Among these, data replay strategies are the most common: they store a limited selection of previous samples and rehearse them alongside new-task data. Direct replay uses raw exemplars, while generative replay synthesizes data that mimics former classes with models such as GANs. Although effective, generative replay suffers from scalability issues in complex domains.
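A minimal sketch of direct replay is shown below; it assumes a hypothetical `exemplar_sets` memory keyed by old class and a per-class budget, and is not the selection scheme used by any specific paper (herding-based selection, as in iCaRL, would typically replace the random choice).

```python
import random
from torch.utils.data import ConcatDataset, Subset

def build_replay_dataset(new_task_dataset, exemplar_sets, memory_per_class=20):
    """Combine the new task's data with a small exemplar memory of old classes.

    exemplar_sets: dict mapping an old class id to a dataset of its stored samples.
    memory_per_class: illustrative budget; real methods tune this against total memory.
    """
    exemplars = []
    for cls, samples in exemplar_sets.items():
        keep = min(memory_per_class, len(samples))
        idx = random.sample(range(len(samples)), keep)
        exemplars.append(Subset(samples, idx))
    # Rehearse old exemplars jointly with the new classes' data.
    return ConcatDataset([new_task_dataset, *exemplars])
```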
Alternatively, data regularization methods apply constraints during learning to prevent the model from modifying weights vital to previous tasks. Gradient Episodic Memory (GEM), for example, constrains each gradient update so that it does not increase the loss on stored exemplars from old tasks, projecting the gradient whenever it conflicts with them. These methods, however, can be inefficient because the constrained optimization adds overhead to every update.
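The sketch below shows the single-constraint projection used by A-GEM, a common simplification of GEM; GEM itself solves a quadratic program with one constraint per previous task. The function name and flattened-gradient inputs are assumptions for illustration.

```python
import torch

def agem_project(grad_new, grad_ref):
    """Project the new-task gradient so the update does not increase the old-task loss.

    grad_new: flattened gradient on the current batch of new-task data.
    grad_ref: flattened gradient on a batch of stored old-task exemplars.
    """
    dot = torch.dot(grad_new, grad_ref)
    if dot < 0:  # the raw update would raise the old-task loss
        grad_new = grad_new - (dot / torch.dot(grad_ref, grad_ref)) * grad_ref
    return grad_new
```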
Model-Centric Approaches
Model-centric strategies use architectural adaptations to expand the model's capacity dynamically. Dynamic networks such as DER progressively grow the architecture, typically by adding a new backbone branch dedicated to each incoming task. This expansion prevents new tasks from overwriting old parameters and improves adaptability, albeit at an increased memory cost. Notably, model-centric strategies continue to advance with architectures such as vision transformers (ViTs), where prompt-based incremental learning has shown strong potential.
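A minimal sketch of DER-style expansion is given below, assuming a user-supplied `make_backbone` factory and a fixed per-branch feature dimension; the actual method also uses an auxiliary loss and channel pruning, which are omitted here.

```python
import torch
import torch.nn as nn

class ExpandableNet(nn.Module):
    """Keep frozen copies of old feature extractors, add a fresh one per task,
    and classify over their concatenated features."""

    def __init__(self, make_backbone, feat_dim, num_classes):
        super().__init__()
        self.make_backbone = make_backbone
        self.feat_dim = feat_dim
        self.backbones = nn.ModuleList([make_backbone()])
        self.classifier = nn.Linear(feat_dim, num_classes)

    def expand(self, new_num_classes):
        # Freeze all previously learned extractors so they cannot be overwritten.
        for b in self.backbones:
            for p in b.parameters():
                p.requires_grad_(False)
        # Add a new branch for the incoming classes and widen the classifier head.
        self.backbones.append(self.make_backbone())
        self.classifier = nn.Linear(self.feat_dim * len(self.backbones), new_num_classes)

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.backbones], dim=1)
        return self.classifier(feats)
```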
Parameter regularization methods estimate the importance of model parameters in order to preserve crucial knowledge across tasks. Elastic Weight Consolidation (EWC), a classic example, uses Fisher information to decide which parameters should be protected, but its impact is often limited by memory constraints and by potential conflicts between tasks.
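The quadratic EWC penalty can be sketched as below, assuming `fisher` and `old_params` dictionaries computed after the previous task finished training; the regularization strength `lam` is an illustrative hyperparameter.

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=100.0):
    """Quadratic EWC penalty: penalize drift of parameters that the (diagonal)
    Fisher information marks as important for earlier tasks."""
    loss = 0.0
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam / 2.0 * loss
```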
Algorithm-Centric Approaches
Algorithm-centric CIL focuses on refining the learning protocol to maintain knowledge. A predominant technique is knowledge distillation, which transfers knowledge from the previous model to the current one by aligning their outputs or intermediate features. It is versatile, applicable to logits, features, and relational representations, and effective at retaining prior knowledge. The paper also highlights model rectification, which identifies and corrects biases introduced by the imbalanced data encountered during incremental updates.
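A minimal sketch of logit-level distillation (in the spirit of LwF/iCaRL-style CIL) is shown below; the temperature and the assumption that the teacher's head covers only the old classes are illustrative choices, not a specific paper's implementation.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Match the current model's softened predictions on old classes to those of
    the frozen previous model (the teacher)."""
    # Only the old classes are shared between the teacher and student heads.
    n_old = teacher_logits.size(1)
    log_p_student = F.log_softmax(student_logits[:, :n_old] / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```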
Experimental Evaluation and Implications
The survey rigorously evaluates numerous CIL techniques across benchmark datasets like CIFAR100 and ImageNet, revealing patterns in method efficacy concerning memory budgets. The results indicate that while dynamic networks achieve superior accuracy, they do so at the expense of higher memory usage. Conversely, knowledge distillation shows competitive performance with more modest resource requirements.
The authors advocate for fair comparisons through memory alignment, providing a comprehensive perspective on trade-offs between accuracy and resource utilization. This aligns with the survey’s proposition of a memory-agnostic evaluation metric, encouraging researchers to design adaptable CIL methods suitable for varying computational environments.
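One way such a memory-agnostic metric can be realized is as the area under an accuracy-memory curve measured at several aligned budgets; the sketch below is an assumed illustration of that idea using the trapezoidal rule, not the survey's exact definition.

```python
import numpy as np

def accuracy_memory_auc(memory_budgets_mb, accuracies):
    """Summarize how a method trades accuracy against its total memory footprint
    (exemplars + model) as a single area-under-curve number.

    memory_budgets_mb, accuracies: parallel sequences measured at aligned memory
    budgets, with budgets sorted in ascending order.
    """
    x = np.asarray(memory_budgets_mb, dtype=float)
    y = np.asarray(accuracies, dtype=float)
    # Normalize the memory axis so methods evaluated over different ranges are comparable.
    x = (x - x.min()) / (x.max() - x.min())
    return np.trapz(y, x)
```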
Conclusion and Future Directions
This survey not only consolidates knowledge of current CIL techniques but also suggests future research directions, such as combining CIL with complex data streams, ensuring methods remain compatible across varied deployment scenarios, and leveraging pre-trained models effectively. The analysis underscores the need for holistic approaches that blend multiple methodologies to manage the challenges of class-incremental learning.
By demarcating the progress and gaps in the field, this paper offers a valuable resource for practitioners and researchers pursuing robust methodologies to counteract catastrophic forgetting and adapt models to ever-evolving data landscapes. As AI systems continually integrate into complex real-world applications, the demand for resilient learning techniques such as these will only escalate.