Improving the Reusability of Pre-trained Language Models in Real-world Applications (2307.10457v3)

Published 19 Jul 2023 in cs.CL

Abstract: The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.
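The core idea the abstract describes, adding a masked-language-modeling loss alongside the standard fine-tuning loss on the same batch, can be sketched as below. This is a minimal illustration only: the model checkpoint, masking rate, shared-encoder setup, and loss weighting are assumptions for the sketch, not the paper's exact Mask-tuning recipe.

```python
# Sketch: joint fine-tuning + MLM objective with a shared encoder (illustrative, not the paper's exact method).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
clf_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Share the encoder so MLM gradients update the same representation used for classification.
mlm_model.bert = clf_model.bert

optimizer = torch.optim.AdamW(
    list(clf_model.parameters()) + list(mlm_model.cls.parameters()), lr=2e-5
)

def mask_tokens(input_ids, mask_prob=0.15):
    """Randomly mask tokens and build MLM labels (unmasked positions are ignored)."""
    labels = input_ids.clone()
    mask = torch.rand(input_ids.shape) < mask_prob
    # Never mask special tokens (CLS, SEP, PAD).
    special = torch.tensor(
        [tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
         for ids in input_ids.tolist()], dtype=torch.bool
    )
    mask &= ~special
    labels[~mask] = -100                         # -100 is ignored by the cross-entropy loss
    masked_ids = input_ids.clone()
    masked_ids[mask] = tokenizer.mask_token_id
    return masked_ids, labels

def training_step(texts, class_labels, mlm_weight=0.5):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Standard fine-tuning loss on the original (unmasked) inputs.
    clf_loss = clf_model(**batch, labels=class_labels).loss
    # MLM loss on a masked copy of the same batch.
    masked_ids, mlm_labels = mask_tokens(batch["input_ids"])
    mlm_loss = mlm_model(input_ids=masked_ids,
                         attention_mask=batch["attention_mask"],
                         labels=mlm_labels).loss
    loss = clf_loss + mlm_weight * mlm_loss      # weighted combination (assumed weighting)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Example usage with toy inputs:
# training_step(["great movie", "terrible plot"], torch.tensor([1, 0]))
```

The design intent illustrated here is that both objectives back-propagate through the same encoder, so the MLM signal regularizes the representations learned during fine-tuning rather than training a separate model.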

Authors (5)
  1. Somayeh Ghanbarzadeh (2 papers)
  2. Hamid Palangi (52 papers)
  3. Yan Huang (180 papers)
  4. Radames Cruz Moreno (4 papers)
  5. Hamed Khanpour (6 papers)