Continual Training of Language Models for Few-Shot Learning (2210.05549v1)
Published 11 Oct 2022 in cs.CL, cs.AI, cs.LG, and cs.NE
Abstract: Recent work on applying large language models (LMs) achieves impressive performance in many NLP applications. Adapting or post-training an LM using an unlabeled domain corpus can produce even better performance for end-tasks in that domain. This paper proposes the problem of continually extending an LM by incrementally post-training it on a sequence of unlabeled domain corpora, expanding its knowledge without forgetting its previous skills. The goal is to improve few-shot end-task learning in these domains. The resulting system is called CPT (Continual PostTraining), which, to our knowledge, is the first continual post-training system. Experimental results verify its effectiveness.
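
The abstract describes post-training the same LM on a sequence of unlabeled domain corpora before few-shot end-task learning in each domain. The sketch below illustrates that sequential post-training loop with Hugging Face `transformers`; the backbone (`roberta-base`), the corpus file names, and all hyperparameters are illustrative assumptions, and it omits the forgetting-avoidance machinery that distinguishes CPT from naive sequential post-training.

```python
# Minimal sketch of sequential domain post-training (masked LM objective).
# Backbone, corpora, and hyperparameters are assumptions for illustration;
# this is NOT the authors' CPT implementation, which additionally protects
# knowledge of earlier domains from being forgotten.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from datasets import load_dataset

MODEL_NAME = "roberta-base"            # assumed backbone
DOMAIN_CORPORA = ["restaurant.txt",    # hypothetical unlabeled domain corpora,
                  "acl_papers.txt",    # arriving one after another
                  "ai_papers.txt"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Incrementally post-train the same model on each domain corpus in turn.
for i, corpus_path in enumerate(DOMAIN_CORPORA):
    corpus = load_dataset("text", data_files=corpus_path)["train"]
    corpus = corpus.map(tokenize, batched=True, remove_columns=["text"])
    args = TrainingArguments(output_dir=f"cpt_domain_{i}",
                             num_train_epochs=1,
                             per_device_train_batch_size=16)
    Trainer(model=model, args=args, train_dataset=corpus,
            data_collator=collator).train()
    # After post-training on domain i, the updated encoder weights can be
    # loaded into a classification head and fine-tuned with only a few
    # labeled examples for that domain's end-task (few-shot learning).
```

The loop body is where a continual-learning method must intervene: without it, each new domain's updates overwrite those from earlier domains, which is exactly the forgetting problem CPT is designed to address.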
- Zixuan Ke (26 papers)
- Haowei Lin (21 papers)
- Yijia Shao (18 papers)
- Hu Xu (87 papers)
- Lei Shu (82 papers)
- Bing Liu (211 papers)