Recent Advances of Foundation Language Models-based Continual Learning: A Survey (2405.18653v2)

Published 28 May 2024 in cs.CL

Abstract: Recently, foundation language models (LMs) have achieved significant success in both natural language processing (NLP) and computer vision (CV). Unlike traditional neural network models, foundation LMs gain strong transfer-learning ability by acquiring rich commonsense knowledge through pre-training on extensive unsupervised corpora with a vast number of parameters. However, they still cannot emulate human-like continual learning because of catastrophic forgetting. Consequently, various continual learning (CL)-based methodologies have been developed to refine LMs, enabling them to adapt to new tasks without forgetting previous knowledge. However, a systematic taxonomy of existing approaches and a comparison of their performance are still lacking, and this is the gap our survey aims to fill. We provide a comprehensive review, summarization, and classification of the existing literature on CL-based approaches applied to foundation language models, such as pre-trained language models (PLMs), large language models (LLMs), and vision-language models (VLMs). We divide these studies into offline CL and online CL, which comprise traditional methods, parameter-efficient methods, instruction tuning-based methods, and continual pre-training methods. Offline CL encompasses domain-incremental learning, task-incremental learning, and class-incremental learning, while online CL is subdivided into hard task boundary and blurry task boundary settings. Additionally, we outline the typical datasets and metrics employed in CL research and provide a detailed analysis of the challenges and future directions for LM-based continual learning.
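
To make the evaluation protocol concrete, the sketch below (not taken from the survey itself; the accuracy matrix and its values are hypothetical) computes two metrics commonly reported in continual learning work: average accuracy after training on the final task, and average forgetting on earlier tasks.

```python
# Minimal sketch of two standard continual learning metrics.
# Assumes acc[i][j] holds accuracy on task j after training on task i.
import numpy as np

def average_accuracy(acc: np.ndarray) -> float:
    """Mean accuracy over all tasks after training on the final task."""
    T = acc.shape[0]
    return float(acc[T - 1].mean())

def forgetting(acc: np.ndarray) -> float:
    """Average drop from each earlier task's best accuracy to its final accuracy."""
    T = acc.shape[0]
    if T < 2:
        return 0.0
    drops = [acc[:T - 1, j].max() - acc[T - 1, j] for j in range(T - 1)]
    return float(np.mean(drops))

# Hypothetical run over 3 sequential tasks:
# rows = training stage, columns = task being evaluated.
acc = np.array([
    [0.90, 0.10, 0.05],
    [0.70, 0.88, 0.12],
    [0.55, 0.75, 0.91],
])
print(average_accuracy(acc))  # ~0.737
print(forgetting(acc))        # ((0.90-0.55) + (0.88-0.75)) / 2 = 0.24
```

Under this protocol, offline CL evaluates after each task finishes, whereas online CL streams examples and may lack clean task boundaries, so the accuracy matrix is typically filled at fixed evaluation checkpoints instead.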

Authors (8)
  1. Yutao Yang
  2. Jie Zhou
  3. Xuanwen Ding
  4. Tianyu Huai
  5. Shunyu Liu
  6. Qin Chen
  7. Liang He
  8. Yuan Xie