Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models (2403.02756v1)
Abstract: The growing interest in LLMs for specialized applications has revealed a significant challenge: when tailored to specific domains, LLMs tend to experience catastrophic forgetting, compromising their general capabilities and leading to a suboptimal user experience. Additionally, crafting a versatile model for multiple domains simultaneously often results in a decline in overall performance due to confusion between domains. In response to these issues, we present the RolE Prompting Guided Multi-Domain Adaptation (REGA) strategy. This novel approach manages multi-domain LLM adaptation through three key components: 1) Self-Distillation constructs and replays general-domain exemplars to alleviate catastrophic forgetting. 2) Role Prompting assigns a central prompt to the general domain and a unique role prompt to each specific domain, minimizing inter-domain confusion during training. 3) Role Integration reuses and integrates a small portion of domain-specific data into the general-domain data, which is trained under the guidance of the central prompt. The central prompt alone is then used at inference, removing the need to switch prompts for different domains. Empirical results demonstrate that REGA effectively alleviates catastrophic forgetting and inter-domain confusion, yielding improved domain-specific performance over standard fine-tuned models while preserving robust general capabilities.
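To make the three components concrete, below is a minimal sketch of how a REGA-style training set could be assembled from self-distilled general exemplars and per-domain data. It is not the authors' code: the prompt wordings, the `build_training_set` helper, the field names, and the 5% reuse ratio are illustrative assumptions; only the overall structure (central prompt for general data, one role prompt per domain, a small domain slice re-labelled with the central prompt) follows the abstract.

```python
# Illustrative sketch of REGA-style data construction (assumed details, not the paper's code).
import random

CENTRAL_PROMPT = "You are a helpful general-purpose assistant."  # assumed wording
ROLE_PROMPTS = {  # one unique role prompt per specific domain (assumed wording)
    "medical": "You are a medical expert assistant.",
    "legal": "You are a legal expert assistant.",
    "finance": "You are a financial expert assistant.",
}

def build_training_set(general_data, domain_data, reuse_ratio=0.05, seed=0):
    """Combine general exemplars and domain-specific data with role prompts.

    general_data: list of {"instruction": ..., "response": ...} exemplars
        produced by self-distillation and replayed to curb forgetting.
    domain_data:  dict mapping domain name -> list of examples.
    reuse_ratio:  fraction of each domain's data also trained under the
        central prompt (Role Integration); the 5% default is an assumption.
    """
    rng = random.Random(seed)
    examples = []

    # 1) Self-Distillation: general exemplars are trained under the central prompt.
    for ex in general_data:
        examples.append({"system": CENTRAL_PROMPT, **ex})

    for domain, data in domain_data.items():
        # 2) Role Prompting: each domain's data is trained under its own role prompt.
        for ex in data:
            examples.append({"system": ROLE_PROMPTS[domain], **ex})

        # 3) Role Integration: a small slice of domain data is additionally
        #    trained under the central prompt, so a single prompt suffices later.
        k = min(len(data), max(1, int(len(data) * reuse_ratio)))
        for ex in rng.sample(data, k):
            examples.append({"system": CENTRAL_PROMPT, **ex})

    rng.shuffle(examples)
    return examples
```

In this reading, only `CENTRAL_PROMPT` is supplied at inference time, which is what spares the deployed model from prompt switching across domains.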