
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models (2102.03752v3)

Published 7 Feb 2021 in cs.CL and cs.LG

Abstract: Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many low-resource scenarios, conventional fine-tuning strategies cannot sufficiently capture the semantic features that matter for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") that improves the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. Experimental results show that CSS-LM outperforms the conventional fine-tuning strategy on a series of downstream tasks in few-shot settings, and also outperforms the latest supervised contrastive fine-tuning strategies. Our datasets and source code will be made publicly available.
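The core mechanism described in the abstract, retrieving positives and negatives from an unlabeled corpus by semantic relatedness and then training with a contrastive objective, can be sketched in a few lines. Below is a minimal illustrative sketch in PyTorch, assuming a PLM encoder that produces fixed-size sentence embeddings; the `retrieve` helper, the cosine-similarity retrieval rule, and the InfoNCE-style loss are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def retrieve(anchor: torch.Tensor, corpus: torch.Tensor, k: int):
    """Pick the k most and k least similar corpus embeddings to the anchor,
    a stand-in for the paper's domain-/class-level relatedness retrieval."""
    sims = F.cosine_similarity(anchor.unsqueeze(0), corpus, dim=-1)
    pos_idx = sims.topk(k).indices     # most related -> treat as positives
    neg_idx = (-sims).topk(k).indices  # least related -> treat as negatives
    return corpus[pos_idx], corpus[neg_idx]


def contrastive_loss(anchor, positives, negatives, tau: float = 0.07):
    """InfoNCE-style loss: pull retrieved positives toward the anchor in
    embedding space and push retrieved negatives away."""
    pos_sim = F.cosine_similarity(anchor.unsqueeze(0), positives, dim=-1) / tau
    neg_sim = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1) / tau
    neg_lse = torch.logsumexp(neg_sim, dim=0)  # log-sum-exp over all negatives
    # For each positive i: -log( exp(s_i) / (exp(s_i) + sum_j exp(n_j)) )
    return (torch.logaddexp(pos_sim, neg_lse) - pos_sim).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    dim = 768                        # typical BERT-base hidden size
    anchor = torch.randn(dim)        # embedding of a labeled task instance
    corpus = torch.randn(1000, dim)  # embeddings of unlabeled corpus instances
    positives, negatives = retrieve(anchor, corpus, k=8)
    print(f"loss: {contrastive_loss(anchor, positives, negatives).item():.4f}")
```

In the full framework, both the retrieved unlabeled instances and the original labeled instances would feed an objective of this kind, so the encoder learns task-related semantic features even when labeled data is scarce.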

Authors (8)
  1. Yusheng Su (21 papers)
  2. Xu Han (270 papers)
  3. Yankai Lin (125 papers)
  4. Zhengyan Zhang (46 papers)
  5. Zhiyuan Liu (433 papers)
  6. Peng Li (390 papers)
  7. Jie Zhou (687 papers)
  8. Maosong Sun (337 papers)
Citations (9)