Defying Imbalanced Forgetting in Class Incremental Learning (2403.14910v1)
Abstract: For the first time, we observe a high level of imbalance in the accuracy of different classes within the same old task. This intriguing phenomenon, discovered in replay-based Class Incremental Learning (CIL), reveals that learned classes are forgotten unevenly, even though their accuracies are similar before catastrophic forgetting occurs. The phenomenon previously went unnoticed because CIL is typically evaluated by average incremental accuracy, which implicitly assumes that classes within the same task have similar accuracy; this assumption breaks down under catastrophic forgetting. Further empirical study indicates that the imbalanced forgetting is caused by representation conflicts between semantically similar old and new classes, conflicts that are rooted in the data imbalance inherent to replay-based CIL methods. Building on these insights, we propose CLass-Aware Disentanglement (CLAD), which predicts the old classes most likely to be forgotten and enhances their accuracy. Importantly, CLAD integrates seamlessly into existing CIL methods. Extensive experiments demonstrate that CLAD consistently improves current replay-based methods, with accuracy gains of up to 2.56%.
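To make the measurement concrete, below is a minimal sketch (assuming NumPy arrays of predicted and true labels) of how one might audit per-class accuracy within an old task, together with a purely illustrative similarity-based proxy for the at-risk classes CLAD targets. All function names here are hypothetical; the paper does not publish this code, and the actual CLAD prediction mechanism may differ.

```python
import numpy as np

# --- Diagnosing imbalanced forgetting (the paper's key observation) ---
# Average incremental accuracy reports one number per task, which hides
# the per-class spread the paper highlights.

def per_class_accuracy(preds: np.ndarray, labels: np.ndarray, class_ids):
    """Accuracy for each class id, computed from predicted vs. true labels."""
    accs = {}
    for c in class_ids:
        mask = labels == c
        accs[c] = float((preds[mask] == c).mean()) if mask.any() else float("nan")
    return accs

def forgetting_imbalance(preds: np.ndarray, labels: np.ndarray, task_classes):
    """Contrast a task's average accuracy with its per-class spread."""
    accs = per_class_accuracy(preds, labels, task_classes)
    vals = np.array(list(accs.values()))
    return {
        "task_avg": float(np.nanmean(vals)),                 # what is usually reported
        "spread": float(np.nanmax(vals) - np.nanmin(vals)),  # the imbalance signal
        "per_class": accs,
    }

# --- Illustrative proxy for predicting at-risk old classes ---
# The abstract attributes imbalanced forgetting to representation conflicts
# between semantically similar old and new classes. A plausible, purely
# illustrative proxy (NOT necessarily CLAD's mechanism): flag old classes
# whose feature prototype is most similar to some new-class prototype.

def at_risk_old_classes(old_protos: dict, new_protos: dict, top_k: int = 2):
    """Rank old classes by max cosine similarity to any new-class prototype."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    scores = {
        c: max(cos(p, q) for q in new_protos.values())
        for c, p in old_protos.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Running the audit on a held-out split of an earlier task after each incremental step should show a large `spread` alongside a reasonable `task_avg`, which is exactly the imbalance that average incremental accuracy conceals.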
Authors: Shixiong Xu, Gaofeng Meng, Xing Nie, Bolin Ni, Bin Fan, Shiming Xiang