Continual Learning: Forget-free Winning Subnetworks for Video Representations (2312.11973v6)
Abstract: Inspired by the Lottery Ticket Hypothesis (LTH), which posits that large, dense networks contain efficient subnetworks, we consider a high-performing Winning Subnetwork (WSN), identified under appropriate sparsity conditions, for a range of continual learning tasks. WSN leverages the pre-existing weights of a dense network to learn efficiently in Task Incremental Learning (TIL) and Task-agnostic Incremental Learning (TaIL) scenarios. For Few-Shot Class Incremental Learning (FSCIL), a variant of WSN called the Soft Subnetwork (SoftNet) is designed to prevent overfitting when data samples are scarce. We further consider the sparse reuse of WSN weights for Video Incremental Learning (VIL), introducing a Fourier Subneural Operator (FSO) within WSN that encodes videos compactly and identifies reusable subnetworks across varying bandwidths. We integrate FSO into different continual learning architectures, covering VIL, TIL, and FSCIL. Comprehensive experiments demonstrate FSO's effectiveness, significantly improving task performance at various convolutional representational levels: FSO enhances higher-layer performance in TIL and FSCIL and lower-layer performance in VIL.
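To make the subnetwork-selection idea concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' code) of how a winning subnetwork can be carved out of a dense layer: each weight gets a learnable score, the top-c fraction of weights by score forms the task's binary mask, and a straight-through estimator lets gradients reach the scores. The class names, the sparsity parameter `c`, and the `weight_score` tensor are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TopKMask(torch.autograd.Function):
    """Binary mask: 1 for the top-c fraction of scores, 0 elsewhere."""

    @staticmethod
    def forward(ctx, scores, c):
        k = max(1, int(c * scores.numel()))
        threshold = torch.topk(scores.flatten(), k).values[-1]
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass gradients to the scores unchanged.
        return grad_output, None


class MaskedLinear(nn.Linear):
    """Dense layer whose effective weight is (weight * per-task binary mask)."""

    def __init__(self, in_features, out_features, c=0.5):
        super().__init__(in_features, out_features)
        self.c = c  # fraction of weights kept for the current task
        self.weight_score = nn.Parameter(torch.randn_like(self.weight))

    def forward(self, x):
        mask = TopKMask.apply(self.weight_score, self.c)
        return nn.functional.linear(x, self.weight * mask, self.bias)
```

In WSN, weights selected for earlier tasks are frozen: a new task may reuse them through its own mask but never overwrites them, which is what makes the scheme forget-free.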
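FSO builds on the Fourier neural operator (Li et al., 2020, cited below): transform the input to the frequency domain, apply a learned complex-valued linear map to the lowest few frequency modes, and transform back. The sketch below is a standard 1D spectral-convolution layer of that kind; the channel counts and the `modes` parameter are illustrative, and the paper's FSO further selects sparse subnetworks of these spectral parameters, which this sketch omits.

```python
import torch
import torch.nn as nn


class SpectralConv1d(nn.Module):
    """FNO-style layer: FFT -> learned mixing of low modes -> inverse FFT."""

    def __init__(self, in_channels, out_channels, modes):
        super().__init__()
        self.modes = modes  # number of low-frequency modes kept
        scale = 1.0 / (in_channels * out_channels)
        self.weights = nn.Parameter(
            scale * torch.randn(in_channels, out_channels, modes,
                                dtype=torch.cfloat))

    def forward(self, x):  # x: (batch, in_channels, length)
        x_ft = torch.fft.rfft(x)  # frequency domain, length // 2 + 1 modes
        out_ft = torch.zeros(x.size(0), self.weights.size(1), x_ft.size(-1),
                             dtype=torch.cfloat, device=x.device)
        # Mix channels mode-by-mode on the lowest `modes` frequencies only.
        out_ft[..., :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :self.modes], self.weights)
        return torch.fft.irfft(out_ft, n=x.size(-1))  # back to signal space


# Usage: `modes` must not exceed length // 2 + 1 of the input signal.
layer = SpectralConv1d(in_channels=8, out_channels=8, modes=16)
y = layer(torch.randn(2, 8, 128))
```

Restricting the learned map to low frequencies is what keeps the representation compact, which matches the abstract's claim that FSO enables compact video encoding across bandwidths.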
- Subspace regularizers for few-shot class incremental learning. arXiv preprint arXiv:2110.07059, 2021.
- Online continual learning with maximal interfered retrieval. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
- Continual learning in low-rank orthogonal subspaces. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Efficient lifelong learning with A-GEM. In Proceedings of the International Conference on Learning Representations (ICLR), 2019.
- Continual learning with tiny episodic memories. arXiv preprint arXiv:1902.10486, 2019.
- Continual predictive learning from videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10728–10737, 2022.
- CNeRV: Content-adaptive neural representation for visual data. arXiv preprint arXiv:2211.10421, 2022.
- HNeRV: A hybrid neural representation for videos. arXiv preprint arXiv:2304.02633, 2023.
- NeRV: Neural representations for videos. Advances in Neural Information Processing Systems, 34:21557–21568, 2021.
- Incremental few-shot learning via vector quantization in deep embedded space. In International Conference on Learning Representations, 2020.
- Long live the lottery: The existence of winning tickets in lifelong learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2021.
- Learning implicit fields for generative shape modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5939–5948, 2019.
- Semantic-aware knowledge distillation for few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2534–2543, 2021.
- MetaFSCIL: A meta-learning approach for few-shot class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14166–14175, 2022.
- Streamable neural fields. In European Conference on Computer Vision, pages 595–612. Springer, 2022.
- Flattening sharpness for dynamic gradient projection memory benefits continual learning. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
- Predicting parameters in deep learning. In Advances in Neural Information Processing Systems (NeurIPS), 2013.
- DyTox: Transformers for continual learning with dynamic token expansion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9285–9295, 2022.
- The lottery ticket hypothesis: Finding sparse, trainable neural networks. In Proceedings of the International Conference on Learning Representations (ICLR), 2019.
- Continual learning via neural pruning. arXiv preprint arXiv:1903.04476, 2019.
- La-MAML: Look-ahead meta learning for continual learning. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems (NeurIPS), 2015.
- Neuroscience-inspired artificial intelligence. Neuron, 95(2):245–258, 2017.
- Towards scalable neural representation for diverse videos. arXiv preprint arXiv:2303.14124, 2023.
- Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
- Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
- Constrained few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9057–9067, 2022.
- Geoffrey Hinton. Neural networks for machine learning. Coursera, video lectures, 2012.
- Learning a unified classifier incrementally via rebalancing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 831–839, 2019.
- Continual learning with node-importance based adaptive group sparse regularization. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217, 2019.
- Forget-free continual learning with winning subnetworks. In International Conference on Machine Learning, pages 10734–10750. PMLR, 2022.
- On the soft-subnetwork for few-shot class incremental learning. arXiv preprint arXiv:2209.07529, 2022.
- Introducing language guidance in prompt-based continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11463–11473, 2023.
- Warping the space: Weight space rotation for class-incremental few-shot learning. In The Eleventh International Conference on Learning Representations, 2023.
- Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13):3521–3526, 2017.
- Neural operator: Learning maps between function spaces. arXiv preprint arXiv:2108.08481, 2021.
- Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
- ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25:1097–1105, 2012.
- Learning task grouping and overlap in multi-task learning. In Proceedings of the International Conference on Machine Learning (ICML), 2012.
- Yann LeCun. The MNIST database of handwritten digits, 1998.
- Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710, 2016.
- Learn to grow: A continual structure learning framework for overcoming catastrophic forgetting. In Proceedings of the International Conference on Machine Learning (ICML), 2019.
- Learning without forgetting. In Proceedings of the European Conference on Computer Vision (ECCV), 2016.
- E-NeRV: Expedite neural video representation with disentangled spatial-temporal context. In Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXV, pages 267–284. Springer, 2022.
- Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
- Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485, 2020.
- Few-shot class-incremental learning via entropy-regularized data-free replay. arXiv preprint arXiv:2207.11213, 2022.
- SGDR: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983, 2016.
- NIRVANA: Neural implicit representations of videos with adaptive networks and autoregressive patch-wise modeling. arXiv preprint arXiv:2212.14593, 2022.
- Piggyback: Adapting a single network to multiple tasks by learning to mask weights. In Proceedings of the European Conference on Computer Vision (ECCV), 2018.
- PackNet: Adding multiple tasks to a single network by iterative pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7765–7773, 2018.
- Few-shot lifelong learning. arXiv preprint arXiv:2103.00991, 2021.
- Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109–165. Elsevier, 1989.
- Modulated periodic activations for generalizable local functional representations. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14214–14223, 2021.
- Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE transactions on pattern analysis and machine intelligence, 35(11):2624–2637, 2013.
- NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
- Linear mode connectivity in multitask and continual learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2021.
- DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 165–174, 2019.
- Space-time prompting for video class-incremental learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 11932–11942, October 2023.
- Few-shot class-incremental learning from an open-set perspective. In European Conference on Computer Vision, pages 382–397. Springer, 2022.
- What’s hidden in a randomly weighted neural network? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
- iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2001–2010, 2017.
- Learning to learn without forgetting by maximizing transfer and minimizing interference. arXiv preprint arXiv:1810.11910, 2018.
- Progressive neural networks. arXiv preprint arXiv:1606.04671, 2016.
- Gradient projection memory for continual learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2021.
- Error sensitivity modulation based experience replay: Mitigating abrupt representation drift in continual learning. In The Eleventh International Conference on Learning Representations, 2023.
- GRAF: Generative radiance fields for 3D-aware image synthesis. Advances in Neural Information Processing Systems, 33:20154–20166, 2020.
- Overcoming catastrophic forgetting with hard attention to the task. In Proceedings of the International Conference on Machine Learning (ICML), 2018.
- Overcoming catastrophic forgetting in incremental few-shot learning by finding flat minima. Advances in Neural Information Processing Systems, 34, 2021.
- Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1874–1883, 2016.
- Continual learning with deep generative replay. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
- Calibrating cnns for lifelong learning. Advances in Neural Information Processing Systems, 33:15579–15590, 2020.
- Implicit neural representations with periodic activation functions. Advances in Neural Information Processing Systems, 33:7462–7473, 2020.
- ConStruct-VL: Data-free continual structured VL concepts learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14994–15004, June 2023.
- CODA-Prompt: Continual decomposed attention-based prompting for rehearsal-free continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11909–11919, June 2023.
- Tiny ImageNet. Stanford CS 231N, 2021. Available online at http://cs231n.stanford.edu/tiny-imagenet-200.zip.
- Decoupling learning and remembering: A bilevel memory framework with knowledge projection for task-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20186–20195, 2023.
- Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems, 33:7537–7547, 2020.
- Few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12183–12192, 2020.
- Sebastian Thrun. A Lifelong Learning Perspective for Mobile Robot Control. Elsevier, 1995.
- Functional regularisation for continual learning with gaussian processes. In Proceedings of the International Conference on Learning Representations (ICLR), 2020.
- Factorized Fourier neural operators. arXiv preprint arXiv:2111.13802, 2021.
- PIVOT: Prompting for video continual learning. arXiv preprint arXiv:2212.04842, 2022.
- S-Prompts learning with pre-trained transformers: An Occam's razor for domain incremental learning. Advances in Neural Information Processing Systems, 35:5682–5695, 2022.
- DualPrompt: Complementary prompting for rehearsal-free continual learning. In European Conference on Computer Vision, pages 631–648. Springer, 2022.
- Learning to prompt for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 139–149, 2022.
- Supermasks in superposition. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- Large scale incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 374–382, 2019.
- Ju Xu and Zhanxing Zhu. Reinforced continual learning. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
- DER: Dynamically expandable representation for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3014–3023, 2021.
- Scalable and order-robust continual learning with additive parameter decomposition. In Proceedings of the International Conference on Learning Representations (ICLR), 2020.
- Online coreset selection for rehearsal-based continual learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2022.
- Lifelong learning with dynamically expandable networks. In Proceedings of the International Conference on Learning Representations (ICLR), 2018.
- Continual learning through synaptic intelligence. In International Conference on Machine Learning, pages 3987–3995. PMLR, 2017.
- Few-shot incremental learning with continually evolved classifiers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12455–12464, 2021.
- Forward compatible few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9046–9056, 2022.
- Few-shot class-incremental learning by sampling multi-phase tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
- Self-promoted prototype refinement for few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6801–6810, 2021.