
Task-guided Disentangled Tuning for Pretrained Language Models (2203.11431v1)

Published 22 Mar 2022 in cs.CL

Abstract: Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, which has produced state-of-the-art results on various NLP tasks. However, data discrepancies in domain and scale prevent fine-tuning from efficiently capturing task-specific patterns, especially in the low-data regime. To address this issue, we propose Task-guided Disentangled Tuning (TDT) for PLMs, which enhances the generalization of representations by disentangling task-relevant signals from the entangled representations. For a given task, we introduce a learnable confidence model to detect indicative guidance from context, and further propose a disentangled regularization to mitigate the over-reliance problem. Experimental results on the GLUE and CLUE benchmarks show that TDT consistently outperforms fine-tuning across different PLMs, and extensive analysis demonstrates the effectiveness and robustness of our method. Code is available at https://github.com/lemon0830/TDT.
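To make the two components in the abstract concrete, here is a minimal, illustrative sketch of (a) a token-level confidence model that scores how indicative each context token is for the task, and (b) a regularizer that penalizes over-reliance on a few tokens. All specifics below (a linear confidence scorer, softmax normalization over the sequence, and a KL-from-uniform penalty) are assumptions for illustration, not the paper's actual formulation; consult the linked repository for the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_weights(hidden, w):
    # Hypothetical confidence model: one linear score per token embedding,
    # normalized over the sequence. (An assumption, not the paper's exact form.)
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, w)) for h in hidden]
    return softmax(scores)

def task_representation(hidden, weights):
    # Confidence-weighted pooling: tokens the confidence model deems
    # task-indicative contribute more to the task representation.
    dim = len(hidden[0])
    return [sum(wt * h[d] for wt, h in zip(weights, hidden)) for d in range(dim)]

def disentangled_regularizer(weights, eps=1e-12):
    # Sketch of an over-reliance penalty: KL divergence of the confidence
    # distribution from uniform. It is 0 for uniform weights and grows as
    # the model concentrates on a handful of tokens. (Assumed form.)
    n = len(weights)
    return sum(w * (math.log(w + eps) - math.log(1.0 / n)) for w in weights)
```

In this sketch, the regularizer would be added to the task loss with a tunable coefficient, so the model is discouraged from basing its prediction on spuriously dominant tokens while still learning which context is genuinely indicative.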

Authors (5)
  1. Jiali Zeng (24 papers)
  2. Yufan Jiang (17 papers)
  3. Shuangzhi Wu (29 papers)
  4. Yongjing Yin (19 papers)
  5. Mu Li (95 papers)
Citations (3)
