A Comprehensive Survey of Continual Learning: Theory, Method and Application (2302.00487v3)

Published 31 Jan 2023 in cs.LG, cs.AI, and cs.CV

Abstract: To cope with real-world dynamics, an intelligent system needs to incrementally acquire, update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as continual learning, provides a foundation for AI systems to develop themselves adaptively. In a general sense, continual learning is explicitly limited by catastrophic forgetting, where learning a new task usually results in a dramatic performance degradation of the old tasks. Beyond this, increasingly numerous advances have emerged in recent years that largely extend the understanding and application of continual learning. The growing and widespread interest in this direction demonstrates its realistic significance as well as complexity. In this work, we present a comprehensive survey of continual learning, seeking to bridge the basic settings, theoretical foundations, representative methods, and practical applications. Based on existing theoretical and empirical results, we summarize the general objectives of continual learning as ensuring a proper stability-plasticity trade-off and an adequate intra/inter-task generalizability in the context of resource efficiency. Then we provide a state-of-the-art and elaborated taxonomy, extensively analyzing how representative methods address continual learning, and how they are adapted to particular challenges in realistic applications. Through an in-depth discussion of promising directions, we believe that such a holistic perspective can greatly facilitate subsequent exploration in this field and beyond.

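As a minimal illustration of the stability-plasticity trade-off the abstract refers to, the sketch below shows a generic regularization-based continual learner in the spirit of elastic weight consolidation: after each task, a parameter snapshot and a diagonal importance estimate are stored, and on later tasks a quadratic penalty pulls important weights back toward the snapshot while the new task's loss provides plasticity. This is an assumption-laden sketch, not code from the survey; names such as QuadraticAnchor, lambda_reg, and consolidate are illustrative only.

```python
import torch
import torch.nn as nn

class QuadraticAnchor:
    """Sketch of an EWC-style quadratic regularizer (illustrative, not from the survey)."""

    def __init__(self, model: nn.Module, lambda_reg: float = 100.0):
        self.model = model
        self.lambda_reg = lambda_reg
        self.anchors = []  # list of (parameter snapshot, importance) pairs, one per past task

    def consolidate(self, loader, loss_fn):
        """Estimate (un-normalized) diagonal importance on the task just finished
        and store it together with a snapshot of the current parameters."""
        importance = {n: torch.zeros_like(p) for n, p in self.model.named_parameters()}
        self.model.eval()
        for x, y in loader:
            self.model.zero_grad()
            loss_fn(self.model(x), y).backward()
            for n, p in self.model.named_parameters():
                if p.grad is not None:
                    importance[n] += p.grad.detach() ** 2  # squared-gradient importance
        snapshot = {n: p.detach().clone() for n, p in self.model.named_parameters()}
        self.anchors.append((snapshot, importance))

    def penalty(self) -> torch.Tensor:
        """Stability term: penalize drift of important parameters from past snapshots."""
        total = torch.zeros(())
        for snapshot, importance in self.anchors:
            for n, p in self.model.named_parameters():
                total = total + (importance[n] * (p - snapshot[n]) ** 2).sum()
        return 0.5 * self.lambda_reg * total
```

In use, one would minimize task_loss + anchor.penalty() while training on each new task and call anchor.consolidate(old_loader, loss_fn) when a task ends; larger lambda_reg favors stability on old tasks, smaller values favor plasticity on the new one.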
  392. Learning with selective forgetting. In IJCAI, volume 2, page 6, 2021.
  393. Online class-incremental continual learning with adversarial shapley value. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 9630–9638, 2021.
  394. Continual learning with deep generative replay. Advances in Neural Information Processing Systems, 30, 2017.
  395. Incremental learning of object detectors without catastrophic forgetting. In Proceedings of the IEEE International Conference on Computer Vision, pages 3400–3409, 2017.
  396. Dlcft: Deep linear continual fine-tuning for general incremental learning. In European Conference on Computer Vision, pages 513–529. Springer, 2022.
  397. On generalizing beyond domains in cross-domain continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9265–9274, 2022.
  398. On learning the geodesic path for incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1591–1600, 2021.
  399. Rectification-based knowledge retention for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15282–15291, 2021.
  400. Calibrating cnns for lifelong learning. Advances in Neural Information Processing Systems, 33:15579–15590, 2020.
  401. Always be dreaming: A new approach for data-free class-incremental learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9374–9384, 2021.
  402. Coda-prompt: Continual decomposed attention-based prompting for rehearsal-free continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11909–11919, 2023.
  403. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2020.
  404. Climb: A continual learning benchmark for vision-and-language tasks. arXiv preprint arXiv:2206.09059, 2022.
  405. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
  406. Unsupervised model adaptation for continual semantic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 2593–2601, 2021.
  407. Lamol: Language modeling for lifelong language learning. In International Conference on Learning Representations, 2019.
  408. Distill and replay for continual language learning. In Proceedings of the 28th international conference on computational linguistics, pages 3569–3579, 2020.
  409. Exploring example influence in continual learning. arXiv preprint arXiv:2209.12241, 2022.
  410. Information-theoretic online memory selection for continual learning. In International Conference on Learning Representations, 2021.
  411. Ernie 2.0: A continual pre-training framework for language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 8968–8975, 2020.
  412. Improving and understanding variational continual learning. arXiv preprint arXiv:1905.02099, 2019.
  413. Layerwise optimization by gradient decomposition for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9634–9643, 2021.
  414. Learning to imagine: Diversify memory for incremental learning using unlabeled data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9549–9558, 2022.
  415. Topology-preserving class-incremental learning. In European Conference on Computer Vision, pages 254–270. Springer, 2020.
  416. Few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12183–12192, 2020.
  417. A deep hierarchical approach to lifelong learning in minecraft. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31, 2017.
  418. Catastrophic forgetting and mode collapse in gans. In International Joint Conference on Neural Networks, pages 1–10. IEEE, 2020.
  419. Functional regularisation for continual learning with gaussian processes. In International Conference on Learning Representations, 2019.
  420. Gcr: Gradient coreset based replay buffer selection for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 99–108, 2022.
  421. Bring evanescent representations to life in lifelong class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16732–16741, 2022.
  422. Natural variational continual learning.
  423. A modeling framework for adaptive lifelong learning with transfer and savings through gating in the prefrontal cortex. Proceedings of the National Academy of Sciences, 117(47):29872–29882, 2020.
  424. Brain-inspired replay for continual learning with artificial neural networks. Nature Communications, 11(1):1–14, 2020.
  425. Gido M Van de Ven and Andreas S Tolias. Three scenarios for continual learning. arXiv preprint arXiv:1904.07734, 2019.
  426. Neural discrete representation learning. Advances in Neural Information Processing Systems, 30, 2017.
  427. Prompt augmented generative replay via supervised contrastive learning for lifelong intent detection. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1113–1127, 2022.
  428. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
  429. Efficient continual learning with modular networks and task-driven priors. In International Conference on Learning Representations, 2020.
  430. Rehearsal revealed: The limits and merits of revisiting samples in continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9385–9394, 2021.
  431. vclimb: A novel video class incremental learning benchmark. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19035–19044, 2022.
  432. Jeffrey S Vitter. Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS), 11(1):37–57, 1985.
  433. Continual learning with hypernetworks. In International Conference on Learning Representations, 2019.
  434. Scott Waddell. Neural plasticity: Dopamine tunes the mushroom body output network. Current Biology, 26(3):R109–R112, 2016.
  435. Mell: Large-scale extensible user intent classification for dialogue systems with meta lifelong learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 3649–3659, 2021.
  436. Lifelong graph learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13719–13728, 2022.
  437. Foster: Feature boosting and compression for class-incremental learning. arXiv preprint arXiv:2204.04662, 2022.
  438. Wanderlust: Online continual object detection in the real world. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10829–10838, 2021.
  439. Acae-remind for online continual learning with compressed feature replay. Pattern Recognition Letters, 150:122–129, 2021.
  440. Triple-memory networks: A brain-inspired method for continual learning. IEEE Transactions on Neural Networks and Learning Systems, 33(5):1925–1934, 2021.
  441. Ordisco: Effective and efficient usage of incremental unlabeled data for semi-supervised continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5383–5392, 2021.
  442. Afec: Active forgetting of negative transfer in continual learning. Advances in Neural Information Processing Systems, 34:22379–22391, 2021.
  443. Coscl: Cooperation of small continual learners is stronger than a big one. In European Conference on Computer Vision, pages 254–271. Springer, 2022.
  444. Memory replay with data compression for continual learning. In International Conference on Learning Representations, 2021.
  445. Learngene: From open-world to your learning task. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8557–8565, 2022.
  446. Anti-retroactive interference for lifelong learning. In European Conference on Computer Vision, pages 163–178. Springer, 2022.
  447. Few-shot class-incremental learning for named entity recognition. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 571–582, 2022.
  448. Training networks in null space of feature covariance for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 184–193, 2021.
  449. Incremental learning from scratch for task-oriented dialogue systems. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3710–3720, 2019.
  450. S-prompts learning with pre-trained transformers: An occam’s razor for domain incremental learning. arXiv preprint arXiv:2207.12819, 2022.
  451. Continual learning with lifelong vision transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 171–181, 2022.
  452. Continual learning through retrieval and imagination. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 8, 2022.
  453. Online continual learning with contrastive vision transformer. In European Conference on Computer Vision, pages 631–650. Springer, 2022.
  454. Efficient meta lifelong-learning with limited memory. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 535–548, 2020.
  455. Improving task-free continual learning by distributionally robust memory evolution. In International Conference on Machine Learning, pages 22985–22998. PMLR, 2022.
  456. Meta-learning with less forgetting on large-scale non-stationary task distributions. In European Conference on Computer Vision, pages 221–238. Springer, 2022.
  457. Dualprompt: Complementary prompting for rehearsal-free continual learning. arXiv preprint arXiv:2204.04799, 2022.
  458. Learning to prompt for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 139–149, 2022.
  459. Continual learning with guarantees via weight interval constraints. In International Conference on Machine Learning, pages 23897–23911. PMLR, 2022.
  460. Disentangling transfer in continual reinforcement learning. arXiv preprint arXiv:2209.13900, 2022.
  461. Continual world: A robotic benchmark for continual reinforcement learning. Advances in Neural Information Processing Systems, 34:28496–28510, 2021.
  462. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. In International Conference on Machine Learning, pages 23965–23998. PMLR, 2022.
  463. Supermasks in superposition. Advances in Neural Information Processing Systems, 33:15173–15184, 2020.
  464. Memory replay gans: Learning to generate new categories without forgetting. Advances in Neural Information Processing Systems, 31, 2018.
  465. Striking a balance between stability and plasticity for class-incremental learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1124–1133, 2021.
  466. Pretrained language model in continual learning: A comparative study. In International Conference on Learning Representations, 2021.
  467. Curriculum-meta learning for order-robust continual relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 10363–10369, 2021.
  468. Class-incremental learning with strong pre-trained models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9601–9610, 2022.
  469. Large scale incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 374–382, 2019.
  470. Deltagrad: Rapid retraining of machine learning models. In International Conference on Machine Learning, pages 10355–10366. PMLR, 2020.
  471. Incremental learning via rate reduction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1125–1133, 2021.
  472. Incremental few-shot text classification with multi-round new classes: Formulation, dataset and system. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1351–1360, 2021.
  473. Incremental learning using conditional adversarial networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6619–6628, 2019.
  474. General incremental learning with domain-aware categorical representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14351–14360, 2022.
  475. Geometry of sequence working memory in macaque prefrontal cortex. Science, 375(6581):632–639, 2022.
  476. Reinforced continual learning. Advances in Neural Information Processing Systems, 31, 2018.
  477. Continual learning of control primitives: Skill discovery via reset-games. Advances in Neural Information Processing Systems, 33:4999–5010, 2020.
  478. Meta-attention for vit-backed continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 150–159, 2022.
  479. Generative negative text replay for continual vision-language pretraining. In European Conference on Computer Vision, pages 22–38. Springer, 2022.
  480. Der: Dynamically expandable representation for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3014–3023, 2021.
  481. An em framework for online incremental learning of semantic segmentation. In Proceedings of the ACM International Conference on Multimedia, pages 3052–3060, 2021.
  482. Uncertainty-aware contrastive distillation for incremental semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  483. Stably maintained dendritic spines are associated with lifelong memories. Nature, 462(7275):920–924, 2009.
  484. Learning latent representations across multiple data domains using lifelong vaegan. In European Conference on Computer Vision, pages 777–795. Springer, 2020.
  485. Task-free continual learning via online discrepancy distance learning. arXiv preprint arXiv:2210.06579, 2022.
  486. Learning with recoverable forgetting. In European Conference on Computer Vision, pages 87–103. Springer, 2022.
  487. Mitigating forgetting in online continual learning with neuron calibration. Advances in Neural Information Processing Systems, 34:10260–10272, 2021.
  488. Dreaming to distill: Data-free knowledge transfer via deepinversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8715–8724, 2020.
  489. Contintin: Continual learning from task instructions. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3062–3072, 2022.
  490. Federated continual learning with weighted inter-client transfer. In International Conference on Machine Learning, pages 12073–12086. PMLR, 2021.
  491. Scalable and order-robust continual learning with additive parameter decomposition. In International Conference on Learning Representations, 2019.
  492. Online coreset selection for rehearsal-based continual learning. In International Conference on Learning Representations, 2021.
  493. Lifelong learning with dynamically expandable networks. In International Conference on Learning Representations, 2018.
  494. Continual learning by modeling intra-class variation. arXiv preprint arXiv:2210.05398, 2022.
  495. Self-training for class-incremental semantic segmentation. IEEE Transactions on Neural Networks and Learning Systems, 2022.
  496. Semantic drift compensation for class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6982–6991, 2020.
  497. Continual learning of context-dependent processing in neural networks. Nature Machine Intelligence, 1(8):364–372, 2019.
  498. Continual learning through synaptic intelligence. In International Conference on Machine Learning, pages 3987–3995. PMLR, 2017.
  499. Piggyback gan: Efficient lifelong learning for image conditioned generation. In European Conference on Computer Vision, pages 397–413. Springer, 2020.
  500. Hyper-lifelonggan: scalable lifelong learning for image conditioned generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2246–2255, 2021.
  501. Lifelong gan: Continual learning for conditional image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2759–2768, 2019.
  502. Few-shot incremental learning with continually evolved classifiers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12455–12464, 2021.
  503. Representation compensation networks for continual semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7053–7064, 2022.
  504. Slca: Slow learner with classifier alignment for continual learning on a pre-trained model. arXiv preprint arXiv:2303.05118, 2023.
  505. Mixup: Beyond empirical risk minimization. In International Conference on Learning Representations, 2018.
  506. Class-incremental learning via deep model consolidation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1131–1140, 2020.
  507. Side-tuning: a baseline for network adaptation via additive side networks. In European Conference on Computer Vision, pages 698–714. Springer, 2020.
  508. Active protection: Learning-activated raf/mapk activity protects labile memory from rac1-independent forgetting. Neuron, 98(1):142–155, 2018.
  509. Cglb: Benchmark tasks for continual graph learning. In Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022.
  510. Epicker is an exemplar-based continual learning approach for knowledge accumulation in cryoem particle picking. Nature Communications, 13(1):1–10, 2022.
  511. A simple but strong baseline for online continual learning: Repeated augmented rehearsal. arXiv preprint arXiv:2209.13917, 2022.
  512. Continual sequence generation with adaptive compositional modules. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3653–3667, 2022.
  513. Maintaining discrimination and fairness in class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13208–13217, 2020.
  514. Mgsvf: Multi-grained slow vs. fast framework for few-shot class-incremental learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
  515. Genetic dissection of mutual interference between two consecutive learning tasks in drosophila. Elife, 12:e83516, 2023.
  516. On leveraging pretrained gans for generation with limited data. In International Conference on Machine Learning, pages 11340–11351. PMLR, 2020.
  517. Static-dynamic co-teaching for class-incremental 3d object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 3436–3445, 2022.
  518. Forward compatible few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9046–9056, 2022.
  519. Few-shot class-incremental learning by sampling multi-phase tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  520. Co-transport for class-incremental learning. In Proceedings of the ACM International Conference on Multimedia, pages 1645–1654, 2021.
  521. Image de-raining via continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4907–4916, 2021.
  522. Objects as points. arXiv preprint arXiv:1904.07850, 2019.
  523. Class-incremental learning via dual augmentation. Advances in Neural Information Processing Systems, 34:14306–14318, 2021.
  524. Prototype augmentation and self-supervision for incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5871–5880, 2021.
  525. Self-promoted prototype refinement for few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6801–6810, 2021.
  526. Self-sustaining representation expansion for non-exemplar class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9296–9305, 2022.
  527. Continual prompt tuning for dialog state tracking. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1124–1137, 2022.
  528. Margin-based few-shot class-incremental learning with class-level overfitting mitigation. arXiv preprint arXiv:2210.04524, 2022.
Authors (4)
  1. Liyuan Wang (33 papers)
  2. Xingxing Zhang (65 papers)
  3. Hang Su (224 papers)
  4. Jun Zhu (424 papers)
Citations (388)

Summary

Continual Learning: A Comprehensive Survey of Theory, Methods, and Applications

Continual learning (CL), also known as incremental learning or lifelong learning, addresses the challenge of learning from non-stationary data, where an intelligent system must acquire, update, and exploit knowledge incrementally. This ability is fundamentally constrained by catastrophic forgetting, in which learning new information degrades performance on previously learned tasks. Recent years have brought numerous advances that extend both the understanding and the application of continual learning, and the survey by Wang et al. organizes these developments into a coherent whole.

Overview of Key Contributions

The survey by Wang et al. organizes state-of-the-art CL methods into an extensive taxonomy of five major approaches: regularization-based, replay-based, optimization-based, representation-based, and architecture-based. This taxonomy clarifies how the different strategies address continual learning and how they are adapted to the specific challenges of practical applications.

Regularization-Based Approach

This approach introduces explicit regularization terms into the loss function to balance learning new tasks against preserving old ones, and it has two primary subcategories: weight regularization and function regularization. Weight regularization penalizes changes to parameters deemed important for previous tasks, with importance typically estimated via the Fisher Information Matrix (FIM), as in EWC and its variants. Function regularization instead distills the predictions (or intermediate features) of the previous model into the current one, as in LwF and its extensions.
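As a rough illustration of weight regularization, the following PyTorch-style sketch adds an EWC-like quadratic penalty to the new-task loss. The dictionaries `fisher_diag` (a diagonal Fisher estimate) and `old_params` (a parameter snapshot taken after the previous task), as well as the strength `lam`, are hypothetical names introduced here for illustration, not details taken from the survey.

```python
import torch

def ewc_penalty(model, old_params, fisher_diag, lam=1000.0):
    """Quadratic penalty discouraging changes to parameters that were
    important (high Fisher information) for previously learned tasks."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if name in fisher_diag:
            penalty = penalty + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# Hypothetical use inside a training step on the new task:
# loss = task_loss(model(x), y) + ewc_penalty(model, old_params, fisher_diag)
```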

Replay-Based Approach

Replay-based methods counter catastrophic forgetting by approximating and recovering the data distributions of earlier tasks. This category is subdivided into experience replay, which retains a small set of old training samples; generative replay, which samples synthetic data from a learned generative model; and feature replay, which stores or models feature-level statistics rather than raw inputs. Although promising, replay-based methods face challenges such as designing efficient memory buffers and avoiding overfitting to the replayed samples.
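The sketch below shows one common way to maintain an experience-replay memory, using reservoir sampling so that a fixed-size buffer holds an approximately uniform sample of the data stream; the class name and capacity are assumptions made for illustration rather than details prescribed by the survey. During training, a mini-batch drawn from the buffer is typically interleaved with the current task's mini-batch when computing the loss.

```python
import random

class ReservoirBuffer:
    """Fixed-size memory that keeps an approximately uniform sample
    of all examples seen so far (reservoir sampling)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.n_seen = 0

    def add(self, example):
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a stored example with probability capacity / n_seen.
            idx = random.randint(0, self.n_seen - 1)
            if idx < self.capacity:
                self.data[idx] = example

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))
```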

Optimization-Based Approach

Optimization-based methods intervene in the optimization process itself, for example by projecting gradients so that updates for a new task do not interfere with previously learned ones, thereby balancing stability and plasticity. This family also includes meta-learning strategies that shape gradient directions from experience, and approaches that seek flat, robust loss landscapes to ease task transitions, as demonstrated by OML and related work.
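As a concrete example of explicit gradient manipulation, the following sketch removes the component of the new-task gradient that conflicts with a reference gradient computed on replayed data, in the spirit of A-GEM-style projection; the flattened-gradient representation and variable names are assumptions made for illustration.

```python
import torch

def project_conflicting_gradient(grad_new, grad_ref):
    """If the new-task gradient points against the reference gradient
    (negative dot product), project out the conflicting component so the
    update is not expected to increase the loss on old tasks.
    Both inputs are 1-D tensors of flattened gradients."""
    dot = torch.dot(grad_new, grad_ref)
    if dot < 0:
        grad_new = grad_new - (dot / torch.dot(grad_ref, grad_ref)) * grad_ref
    return grad_new
```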

Representation-Based and Architecture-Based Approaches

Representation-based methods leverage robust representations, often obtained through self-supervised learning or large-scale pre-training, to improve generalization and reduce forgetting; well-constructed representations tend to be markedly more resistant to interference during continual learning. Architecture-based approaches instead construct task-specific or modular network components that keep tasks largely separated and thus avoid interference, as in Progressive Networks.
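A minimal sketch of the parameter-isolation idea behind many architecture-based methods: a shared backbone with one output head per task, so task-specific parameters never overwrite one another. Unlike Progressive Networks proper, this sketch omits lateral connections and column freezing, and the layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class MultiHeadNet(nn.Module):
    """Shared feature extractor with one dedicated output head per task.
    New heads are appended as tasks arrive; existing heads stay untouched."""
    def __init__(self, in_dim=784, hidden_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.hidden_dim = hidden_dim
        self.heads = nn.ModuleList()

    def add_task(self, num_classes):
        # Called when a new task arrives (task-incremental setting).
        self.heads.append(nn.Linear(self.hidden_dim, num_classes))

    def forward(self, x, task_id):
        return self.heads[task_id](self.backbone(x))
```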

Implications and Outlook

Wang et al. frame the general objectives of continual learning as ensuring a proper stability-plasticity trade-off and adequate intra/inter-task generalizability under constraints of resource efficiency, and their survey distills effective strategies for pursuing these objectives. The discussion of practical applications in areas such as object detection, semantic segmentation, and reinforcement learning underscores the broad applicability and real-world relevance of CL research.

The survey also underscores the growing use of pre-training and self-supervised learning to obtain robust initial representations, which has been shown to substantially reduce the impact of catastrophic forgetting. This paves the way for cross-domain and interdisciplinary applications in heterogeneous data contexts, potentially integrating neural architecture search, efficient memory utilization, and context-aware systems.

Looking forward, continued research in CL is expected to refine its theoretical foundations and extend its applications across diverse domains, from foundational AI systems to neuroscientific studies of biological learning. Advances at the intersection of CL and large-scale models, such as transformer-based foundation models, hold the potential for rich, multi-modal learning systems that adapt to rapidly changing environments with minimal interference.
