Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach (2010.07835v3)

Published 15 Oct 2020 in cs.CL and cs.LG

Abstract: Fine-tuned pre-trained language models (LMs) have achieved enormous success in many NLP tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, this contrastive self-training framework can gradually improve model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest baseline by large margins on 7 benchmarks in 6 tasks, and achieves competitive performance with fully-supervised fine-tuning methods.
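
The two ingredients the abstract names, confidence-based reweighting of pseudo-labels and contrastive regularization over sample pairs, can be illustrated with a short sketch. This is not the paper's COSINE implementation: the `model` interface (assumed to return logits and embeddings), the confidence threshold, the contrastive margin, and the weight `lam` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def confidence_weights(probs, threshold=0.9):
    # Confidence-based reweighting (sketch): keep high-confidence pseudo-labels
    # and weight them by predicted confidence. The threshold is an illustrative
    # choice, not the paper's setting.
    conf, pseudo_labels = probs.max(dim=-1)
    weights = conf * (conf > threshold).float()
    return pseudo_labels, weights

def contrastive_regularizer(embeddings, pseudo_labels, weights, margin=1.0):
    # Contrastive regularization (sketch): pull together embeddings that share a
    # pseudo-label, push apart those that differ, with each pair weighted by the
    # confidence of both samples. The margin is an illustrative choice.
    dist = torch.cdist(embeddings, embeddings)                      # pairwise L2 distances
    same = (pseudo_labels[:, None] == pseudo_labels[None, :]).float()
    pair_w = weights[:, None] * weights[None, :]
    pos_loss = same * dist.pow(2)
    neg_loss = (1 - same) * F.relu(margin - dist).pow(2)
    return (pair_w * (pos_loss + neg_loss)).mean()

def self_training_step(model, batch, optimizer, lam=0.1):
    # One self-training update (sketch): pseudo-label the batch with the current
    # model, reweight by confidence, and add the contrastive term. COSINE's exact
    # update schedule differs; this only illustrates the loss structure.
    model.eval()
    with torch.no_grad():
        logits, _ = model(batch)            # assumed interface: (logits, embeddings)
        probs = F.softmax(logits, dim=-1)
    pseudo_labels, weights = confidence_weights(probs)

    model.train()
    logits, embeddings = model(batch)
    ce = F.cross_entropy(logits, pseudo_labels, reduction="none")
    loss = (weights * ce).mean() + lam * contrastive_regularizer(
        embeddings, pseudo_labels, weights
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The design intent is that low-confidence pseudo-labels contribute little to either term, which is how the framework suppresses error propagation while the model gradually fits the weakly labeled data.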

Authors (6)
  1. Yue Yu (343 papers)
  2. Simiao Zuo (25 papers)
  3. Haoming Jiang (52 papers)
  4. Wendi Ren (3 papers)
  5. Tuo Zhao (131 papers)
  6. Chao Zhang (909 papers)
Citations (122)
