G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks (2212.03613v3)

Published 7 Dec 2022 in cs.CL

Abstract: Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora. However, this Domain-Adaptive Pre-Training (DAPT; Gururangan et al. (2020)) tends to forget the previous general knowledge acquired by general PLMs, which leads to a catastrophic forgetting phenomenon and sub-optimal performance. To alleviate this problem, we propose a new framework of General Memory Augmented Pre-trained Language Model (G-MAP), which augments the domain-specific PLM by a memory representation built from the frozen general PLM without losing any general knowledge. Specifically, we propose a new memory-augmented layer, and based on it, different augmented strategies are explored to build the memory representation and then adaptively fuse it into the domain-specific PLM. We demonstrate the effectiveness of G-MAP on various domains (biomedical and computer science publications, news, and reviews) and different kinds of tasks (text classification, QA, NER), and the extensive results show that the proposed G-MAP can achieve SOTA results on all tasks.
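
The sketch below illustrates the core idea from the abstract: a memory-augmented layer in which hidden states from the domain-specific PLM attend over a memory representation produced by a frozen general PLM, and the result is adaptively fused back into the domain stream. The layer structure, gating scheme, and all names here are illustrative assumptions for clarity, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class MemoryAugmentedLayer(nn.Module):
    """Hypothetical sketch of a G-MAP-style memory-augmented layer.

    Domain-PLM hidden states act as queries over memory states from a
    frozen general PLM; a learnable gate blends the attended memory back
    into the domain representation (assumed fusion scheme, not the paper's
    exact one).
    """

    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        # Cross-attention: domain states are queries, general-PLM memory is keys/values.
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.layer_norm = nn.LayerNorm(hidden_size)
        # Learnable scalar gate controlling how much general knowledge is fused in.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, domain_states: torch.Tensor, general_memory: torch.Tensor) -> torch.Tensor:
        # domain_states:  (batch, seq_len, hidden) from the domain-specific PLM layer
        # general_memory: (batch, mem_len, hidden) from the frozen general PLM (no grad)
        fused, _ = self.cross_attn(query=domain_states,
                                   key=general_memory,
                                   value=general_memory)
        # Adaptive fusion: a sigmoid gate blends memory-attended states into the domain stream.
        g = torch.sigmoid(self.gate)
        return self.layer_norm(domain_states + g * fused)


if __name__ == "__main__":
    # Toy usage: in practice, general_memory would be computed by running the input
    # through a frozen general PLM under torch.no_grad().
    layer = MemoryAugmentedLayer()
    domain_states = torch.randn(2, 16, 768)
    with torch.no_grad():
        general_memory = torch.randn(2, 16, 768)
    out = layer(domain_states, general_memory)
    print(out.shape)  # torch.Size([2, 16, 768])
```

Because the general PLM is kept frozen, its representations preserve general knowledge while gradients only update the domain-specific PLM and the fusion layer, which is what mitigates catastrophic forgetting in this setup.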

Authors (8)
  1. Zhongwei Wan (39 papers)
  2. Yichun Yin (27 papers)
  3. Wei Zhang (1489 papers)
  4. Jiaxin Shi (53 papers)
  5. Lifeng Shang (90 papers)
  6. Guangyong Chen (55 papers)
  7. Xin Jiang (242 papers)
  8. Qun Liu (230 papers)
Citations (14)