
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models (2102.03752v3)

Published 7 Feb 2021 in cs.CL and cs.LG

Abstract: Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many low-resource scenarios, conventional fine-tuning strategies cannot sufficiently capture the semantic features that matter for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") that improves the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. Experimental results show that CSS-LM outperforms the conventional fine-tuning strategy on a series of downstream tasks in few-shot settings, and also outperforms the latest supervised contrastive fine-tuning strategies. Our datasets and source code will be made publicly available.
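The core mechanism described in the abstract, retrieving positives and negatives from an unlabeled corpus by semantic relatedness and then training with a contrastive objective, can be sketched in a few lines. Below is a minimal illustrative sketch in PyTorch, assuming a PLM encoder that produces fixed-size sentence embeddings; the `retrieve` helper, the cosine-similarity retrieval rule, and the InfoNCE-style loss are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def retrieve(anchor: torch.Tensor, corpus: torch.Tensor, k: int):
    """Pick the k most and k least similar corpus embeddings to the anchor,
    a stand-in for the paper's domain-/class-level relatedness retrieval."""
    sims = F.cosine_similarity(anchor.unsqueeze(0), corpus, dim=-1)
    pos_idx = sims.topk(k).indices     # most related -> treat as positives
    neg_idx = (-sims).topk(k).indices  # least related -> treat as negatives
    return corpus[pos_idx], corpus[neg_idx]


def contrastive_loss(anchor, positives, negatives, tau: float = 0.07):
    """InfoNCE-style loss: pull retrieved positives toward the anchor in
    embedding space and push retrieved negatives away."""
    pos_sim = F.cosine_similarity(anchor.unsqueeze(0), positives, dim=-1) / tau
    neg_sim = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1) / tau
    neg_lse = torch.logsumexp(neg_sim, dim=0)  # log-sum-exp over all negatives
    # For each positive i: -log( exp(s_i) / (exp(s_i) + sum_j exp(n_j)) )
    return (torch.logaddexp(pos_sim, neg_lse) - pos_sim).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    dim = 768                        # typical BERT-base hidden size
    anchor = torch.randn(dim)        # embedding of a labeled task instance
    corpus = torch.randn(1000, dim)  # embeddings of unlabeled corpus instances
    positives, negatives = retrieve(anchor, corpus, k=8)
    print(f"loss: {contrastive_loss(anchor, positives, negatives).item():.4f}")
```

In the full framework, both the retrieved unlabeled instances and the original labeled instances would feed an objective of this kind, so the encoder learns task-related semantic features even when labeled data is scarce.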

Authors (8)
  1. Yusheng Su (21 papers)
  2. Xu Han (270 papers)
  3. Yankai Lin (125 papers)
  4. Zhengyan Zhang (46 papers)
  5. Zhiyuan Liu (433 papers)
  6. Peng Li (390 papers)
  7. Jie Zhou (687 papers)
  8. Maosong Sun (337 papers)
Citations (9)