FOCIL: Finetune-and-Freeze for Online Class Incremental Learning by Training Randomly Pruned Sparse Experts (2403.14684v1)
Abstract: Class incremental learning (CIL) in an online continual learning setting strives to acquire knowledge on a series of novel classes from a data stream, using each data point only once for training. This is more realistic than the offline mode, which assumes that all data from the novel class(es) is readily available. Current online CIL approaches store a subset of the previous data, which creates heavy memory and computation overhead and raises privacy issues. In this paper, we propose a new online CIL approach called FOCIL. It fine-tunes the main architecture continually by training a randomly pruned sparse subnetwork for each task, then freezes the trained connections to prevent forgetting. FOCIL also adaptively determines the sparsity level and learning rate per task and ensures (almost) zero forgetting across all tasks without storing any replay data. Experimental results on 10-Task CIFAR-100, 20-Task CIFAR-100, and 100-Task TinyImageNet demonstrate that our method outperforms the SOTA by a large margin. The code is publicly available at https://github.com/muratonuryildirim/FOCIL.
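As a rough illustration of the mechanism the abstract describes (a random sparse subnetwork per task, trained online and then frozen), the following PyTorch sketch is a minimal reconstruction, not FOCIL's actual implementation: the names (`TaskMasker`, `sample_task_mask`, `train_task`, etc.) are hypothetical, the adaptive per-task sparsity and learning-rate selection is reduced to plain arguments, and the handling of biases, normalization layers, and the classifier head is omitted.

```python
# Minimal sketch of the finetune-and-freeze idea described in the abstract.
# All names are illustrative (not FOCIL's API); adaptive sparsity/learning-rate
# selection and the treatment of biases/BatchNorm/classifier heads are omitted.
import torch
import torch.nn as nn


class TaskMasker:
    """Tracks, per weight tensor, which connections are frozen (trained by an
    earlier task) and which belong to the current task's random subnetwork."""

    def __init__(self, model: nn.Module):
        self.frozen = {n: torch.zeros_like(p, dtype=torch.bool)
                       for n, p in model.named_parameters() if p.dim() > 1}
        self.active = {n: m.clone() for n, m in self.frozen.items()}

    def sample_task_mask(self, sparsity: float):
        """Randomly keep a (1 - sparsity) fraction of the still-free connections."""
        for name, frozen in self.frozen.items():
            free = ~frozen
            rand = torch.rand_like(frozen, dtype=torch.float)
            self.active[name] = (rand > sparsity) & free

    def mask_gradients(self, model: nn.Module):
        """Zero gradients outside the active subnetwork (call after backward())
        so frozen and unused connections are never updated."""
        for name, p in model.named_parameters():
            if name in self.active and p.grad is not None:
                p.grad.mul_(self.active[name].float())

    def freeze_current_task(self):
        """Permanently freeze the connections trained for the finished task."""
        for name in self.frozen:
            self.frozen[name] |= self.active[name]


def train_task(model, masker, stream, sparsity, lr):
    """One online pass over a task's stream; each minibatch is seen only once."""
    masker.sample_task_mask(sparsity)                 # FOCIL picks sparsity/lr adaptively
    opt = torch.optim.SGD(model.parameters(), lr=lr)  # no momentum / weight decay
    model.train()
    for x, y in stream:
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        masker.mask_gradients(model)                  # keep frozen weights untouched
        opt.step()
    masker.freeze_current_task()
```

Because plain SGD without momentum or weight decay only moves parameters with non-zero gradients, zeroing the gradients outside the active mask is enough to keep both frozen and unused connections fixed, which is what makes the freeze step (almost) forgetting-free in this sketch while still letting later tasks reuse earlier tasks' frozen weights in the forward pass.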
- Murat Onur Yildirim
- Elif Ceren Gok Yildirim
- Decebal Constantin Mocanu
- Joaquin Vanschoren