Improving the Reusability of Pre-trained Language Models in Real-world Applications (2307.10457v3)

Published 19 Jul 2023 in cs.CL

Abstract: The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem: their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but not for general examples. To address this issue, we propose a training approach called Mask-tuning, which integrates Masked Language Modeling (MLM) training objectives into the fine-tuning process to enhance PLMs' generalization. Comprehensive experiments demonstrate that Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs' generalization on OOD datasets while improving their performance on in-distribution datasets. The findings suggest that Mask-tuning improves the reusability of PLMs on unseen data, making them more practical and effective for real-world applications.
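
The core idea described in the abstract is to combine a masked language modeling objective with the downstream fine-tuning loss. The sketch below illustrates one way such a joint objective could look; the checkpoint name, the 15% masking rate, the shared-encoder architecture, and the `mlm_weight` loss combination are assumptions made for illustration, not the exact Mask-tuning procedure from the paper.

```python
# Illustrative sketch of a joint MLM + fine-tuning objective (not the paper's
# exact Mask-tuning recipe). Assumptions: a BERT encoder shared between its
# pre-trained MLM head and a new classification head, a 15% masking rate,
# and a fixed weight `mlm_weight` for combining the two losses.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForMaskedLM


def mask_tokens(input_ids, tokenizer, prob=0.15):
    """Randomly replace tokens with [MASK] and build MLM labels (-100 = ignore)."""
    candidates = torch.rand(input_ids.shape) < prob
    for special_id in (tokenizer.cls_token_id, tokenizer.sep_token_id, tokenizer.pad_token_id):
        candidates &= input_ids != special_id
    if not candidates.any():
        candidates[0, 1] = True          # guarantee at least one masked position
    labels = input_ids.clone()
    labels[~candidates] = -100           # unmasked positions are ignored by the MLM loss
    masked = input_ids.clone()
    masked[candidates] = tokenizer.mask_token_id
    return masked, labels


class MaskTuningSketch(nn.Module):
    """Shared pre-trained encoder with its MLM head plus a task classifier."""

    def __init__(self, name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.mlm = AutoModelForMaskedLM.from_pretrained(name)
        self.classifier = nn.Linear(self.mlm.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask, mlm_labels, task_labels, mlm_weight=0.5):
        out = self.mlm(input_ids=input_ids,
                       attention_mask=attention_mask,
                       labels=mlm_labels,
                       output_hidden_states=True)
        cls_repr = out.hidden_states[-1][:, 0]       # [CLS] representation for the task head
        logits = self.classifier(cls_repr)
        task_loss = nn.functional.cross_entropy(logits, task_labels)
        # Joint objective: downstream fine-tuning loss plus a weighted MLM loss.
        return task_loss + mlm_weight * out.loss, logits


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    batch = tokenizer(["a great movie", "a dull movie"], padding=True, return_tensors="pt")
    masked_ids, mlm_labels = mask_tokens(batch["input_ids"], tokenizer)
    model = MaskTuningSketch()
    loss, logits = model(masked_ids, batch["attention_mask"], mlm_labels,
                         task_labels=torch.tensor([1, 0]))
    loss.backward()    # an optimizer step on the joint loss would follow here
```

In this sketch the encoder receives masked inputs and is updated by both losses at once; how Mask-tuning actually selects masked tokens and balances the two objectives is specified in the paper itself.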

Authors (5)
  1. Somayeh Ghanbarzadeh (2 papers)
  2. Hamid Palangi (52 papers)
  3. Yan Huang (180 papers)
  4. Radames Cruz Moreno (4 papers)
  5. Hamed Khanpour (6 papers)
