
Towards Robustness and Diversity: Continual Learning in Dialog Generation with Text-Mixup and Batch Nuclear-Norm Maximization (2403.10894v1)

Published 16 Mar 2024 in cs.CL

Abstract: In our dynamic world where data arrives in a continuous stream, continual learning enables us to incrementally add new tasks/domains without retraining from scratch. A major challenge in continual learning of LLMs is catastrophic forgetting: the tendency of models to forget knowledge from previously trained tasks/domains when training on new ones. This paper studies dialog generation under the continual learning setting. We propose a novel method that 1) uses Text-Mixup as data augmentation to avoid model overfitting on replay memory and 2) leverages Batch Nuclear-Norm Maximization (BNNM) to alleviate the problem of mode collapse. Experiments on a 37-domain task-oriented dialog dataset and DailyDialog (a 10-domain chitchat dataset) demonstrate that our proposed approach outperforms the state-of-the-art in continual learning.
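The abstract names two components: mixup-style interpolation applied to text representations, and a batch nuclear-norm term intended to promote output diversity. The sketch below is a rough illustration only, assuming embedding-level mixup and the batch nuclear-norm loss popularized by Cui et al. (2020); the function names, the Beta-sampling scheme, and where these terms enter the training loop are assumptions, not the paper's actual implementation.

```python
# Illustrative sketch (not the paper's code): embedding-level Text-Mixup and a
# batch nuclear-norm maximization term, under the assumptions stated above.
import torch


def text_mixup(emb_a: torch.Tensor, emb_b: torch.Tensor, alpha: float = 0.2) -> torch.Tensor:
    """Interpolate two batches of token embeddings of shape (B, T, D)
    with a mixing weight sampled from Beta(alpha, alpha)."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * emb_a + (1.0 - lam) * emb_b


def bnnm_term(logits: torch.Tensor) -> torch.Tensor:
    """Negative nuclear norm of the batch softmax matrix of shape (B, C).
    Adding this term to the loss (i.e., minimizing it) maximizes the nuclear
    norm, which encourages more diverse, less collapsed predictions."""
    probs = torch.softmax(logits, dim=-1)
    return -torch.linalg.matrix_norm(probs, ord="nuc") / probs.shape[0]
```

In a replay-based setup, one plausible use is to mix embeddings of replay-memory examples with current-task examples via text_mixup and to add bnnm_term (scaled by a weight) to the generation loss; the actual combination used in the paper may differ.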

Authors (7)
  1. Zihan Wang (181 papers)
  2. Jiayu Xiao (7 papers)
  3. Mengxiang Li (3 papers)
  4. Zhongjiang He (11 papers)
  5. Yongxiang Li (22 papers)
  6. Chao Wang (555 papers)
  7. Shuangyong Song (18 papers)