Efficient Expansion and Gradient Based Task Inference for Replay Free Incremental Learning (2312.01188v1)
Abstract: This paper proposes a simple but highly efficient expansion-based model for continual learning. Recent feature-transformation, masking, and factorization-based methods are efficient, but they grow the model only over the global or shared parameters; the task-specific parameters retain none of the earlier knowledge, so these approaches do not fully utilize previously learned information and show limited transfer-learning ability. Moreover, most of these models grow by a constant number of parameters for every task, irrespective of task complexity. Our work proposes a simple filter- and channel-expansion method that grows the model over the previous task parameters, not just over the global parameters. It therefore fully utilizes all previously learned information without forgetting, which results in better knowledge transfer. The growth rate in the proposed model is a function of task complexity: a simple task needs only a small parameter growth, while a complex task requires more parameters to adapt to the current task. Recent expansion-based models show promising results for task incremental learning (TIL); however, for class incremental learning (CIL), predicting the task id is a crucial challenge, and their results degrade rapidly as the number of tasks increases. In this work, we propose a robust task-prediction method that leverages entropy-weighted data augmentations and the model's gradients computed with pseudo labels. We evaluate our model on various datasets and architectures in the TIL, CIL, and generative continual learning settings, where the proposed approach achieves state-of-the-art results; extensive ablation studies show the efficacy of the proposed components.
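The abstract describes the gradient-based task inference only at a high level. Below is a minimal PyTorch sketch of the idea as I read it, not the authors' implementation: each task head scores a batch by the entropy of its predictions over augmented views, weighted by the gradient norm of a pseudo-labeled loss, and the lowest-scoring head gives the task id. The names `backbone`, `task_heads`, `augment`, and `num_views`, and the exact way the entropy weights the gradient score, are my assumptions.

```python
import torch
import torch.nn.functional as F

@torch.enable_grad()
def predict_task_id(backbone, task_heads, x, augment, num_views=4):
    """Pick the task head that best explains batch x (lower score = better)."""
    scores = []
    for head in task_heads:
        score = 0.0
        for _ in range(num_views):
            logits = head(backbone(augment(x)))
            probs = F.softmax(logits, dim=-1)
            # Entropy of the prediction: the head trained on the true task
            # should be confident (low entropy) on its own data.
            entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1).mean()
            # Pseudo-label each view with the head's own argmax prediction;
            # near the loss minimum of the correct task, the gradient of
            # this self-supervised loss should be small.
            pseudo = logits.argmax(dim=-1).detach()
            loss = F.cross_entropy(logits, pseudo)
            grads = torch.autograd.grad(loss, list(head.parameters()))
            grad_norm = torch.cat([g.flatten() for g in grads]).norm()
            # Entropy-weighted gradient score, averaged over augmented views.
            score += (entropy * grad_norm).item() / num_views
        scores.append(score)
    return min(range(len(scores)), key=scores.__getitem__)
```

The filter- and channel-expansion step can be sketched in the same hedged spirit: grow a convolution by a task-dependent number of new output filters while freezing the previously learned ones, so new filters can read, and thus reuse, all earlier features without overwriting them. `expand_conv` and the gradient-mask trick are illustrative; a full implementation would also widen the input channels of the following layer.

```python
import torch
import torch.nn as nn

def expand_conv(old: nn.Conv2d, new_filters: int) -> nn.Conv2d:
    """Grow a conv layer by new_filters output channels; old filters stay frozen."""
    grown = nn.Conv2d(old.in_channels, old.out_channels + new_filters,
                      old.kernel_size, stride=old.stride, padding=old.padding,
                      bias=old.bias is not None)
    with torch.no_grad():
        grown.weight[:old.out_channels].copy_(old.weight)  # reuse old filters
        if old.bias is not None:
            grown.bias[:old.out_channels].copy_(old.bias)
    # Mask gradients so previous-task filters are never updated.
    mask = torch.zeros_like(grown.weight)
    mask[old.out_channels:] = 1.0
    grown.weight.register_hook(lambda g: g * mask)
    if grown.bias is not None:
        bias_mask = torch.zeros_like(grown.bias)
        bias_mask[old.out_channels:] = 1.0
        grown.bias.register_hook(lambda g: g * bias_mask)
    return grown
```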