Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning (2210.10293v1)

Published 19 Oct 2022 in cs.CL

Abstract: Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios. However, learning multiple training objectives in a single model is challenging due to the unknown relative significance as well as the potential contrariety between them. Empirical studies have shown that the current objective sampling in an ad-hoc manual setting makes the learned language representation barely converge to the desired optimum. Thus, we propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern on arbitrary pre-training objectives. Such a design is lightweight with negligible additional training overhead. To validate our approach, we adopt five objectives and conduct continual pre-training with BERT-base and BERT-large models, where MOMETAS demonstrates universal performance gain over other rule-based sampling strategies on 14 natural language processing tasks.
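The abstract describes MOMETAS only at a high level: an adaptive sampler that maintains a distribution over pre-training objectives and adjusts it from meta-learned feedback, instead of fixing the sampling ratio by hand. As a rough illustration of that idea (not the paper's actual algorithm, whose update rule is not given here), the sketch below keeps one weight per objective and re-weights it from a reward signal such as held-out improvement; the class name, the reward definition, the exponentiated re-weighting, and the objective names are all assumptions.

```python
import math
import random


class AdaptiveObjectiveSampler:
    """Toy adaptive sampler over pre-training objectives.

    Hypothetical sketch: MOMETAS's real meta-learning update is not
    specified in the abstract; a simple exponentiated re-weighting
    driven by an external reward signal stands in for it here.
    """

    def __init__(self, objectives, lr=0.1):
        self.lr = lr
        # Start from uniform (unnormalized) weights over the objectives.
        self.weights = {name: 1.0 for name in objectives}

    def probs(self):
        total = sum(self.weights.values())
        return {name: w / total for name, w in self.weights.items()}

    def sample(self):
        # Draw the objective to train on for the next step/batch.
        p = self.probs()
        return random.choices(list(p.keys()), weights=list(p.values()), k=1)[0]

    def update(self, objective, reward):
        # Meta-step: up-weight objectives whose recent use improved a
        # held-out signal (reward > 0), down-weight them otherwise.
        self.weights[objective] *= math.exp(self.lr * reward)


# Usage sketch with five placeholder objectives (the abstract does not
# name the five objectives used in the paper's continual pre-training).
sampler = AdaptiveObjectiveSampler(["obj_a", "obj_b", "obj_c", "obj_d", "obj_e"])
obj = sampler.sample()            # pick an objective for this training round
# ... train on `obj`, measure a validation change `delta` ...
sampler.update(obj, reward=0.02)  # feed the (hypothetical) reward back
```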

Authors (7)
  1. Hongqiu Wu (22 papers)
  2. Ruixue Ding (9 papers)
  3. Hai Zhao (227 papers)
  4. Boli Chen (23 papers)
  5. Pengjun Xie (85 papers)
  6. Fei Huang (410 papers)
  7. Min Zhang (632 papers)
Citations (8)
