Robust Transfer Learning with Pretrained Language Models through Adapters (2108.02340v1)

Published 5 Aug 2021 in cs.CL

Abstract: Transfer learning with large pretrained transformer-based language models like BERT has become the dominant approach for most NLP tasks. Simply fine-tuning these language models on downstream tasks, or combining fine-tuning with task-specific pretraining, is often not robust. In particular, performance varies considerably as the random seed or the number of pretraining and/or fine-tuning iterations changes, and the fine-tuned model is vulnerable to adversarial attacks. We propose a simple yet effective adapter-based approach to mitigate these issues. Specifically, we insert small bottleneck layers (i.e., adapters) within each layer of a pretrained model, then fix the pretrained layers and train only the adapter layers on the downstream task data, with (1) task-specific unsupervised pretraining and then (2) task-specific supervised training (e.g., classification, sequence labeling). Our experiments demonstrate that this training scheme leads to improved stability and adversarial robustness in transfer learning to various downstream tasks.
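
The adapter recipe described in the abstract can be sketched roughly as follows. This is a minimal PyTorch illustration, not the authors' implementation: the hidden size (768), bottleneck width (64), GELU activation, zero initialization, and the `freeze_backbone_except_adapters` helper are assumptions made for the sketch.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """A small bottleneck module inserted inside a pretrained transformer layer.

    Down-projects the hidden states, applies a nonlinearity, up-projects back
    to the original width, and adds a residual connection. The up-projection is
    zero-initialized so the adapter starts out as an identity map.
    """

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


def freeze_backbone_except_adapters(model: nn.Module) -> list:
    """Freeze all pretrained parameters; return only adapter parameters.

    The returned parameters would be handed to the optimizer for both training
    stages named in the abstract: (1) task-specific unsupervised pretraining
    and (2) task-specific supervised training.
    """
    trainable = []
    for name, param in model.named_parameters():
        if "adapter" in name:  # assumes adapter submodules are registered under the name "adapter"
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable
```

In practice the adapter would be registered as a submodule (e.g., named `adapter`) after the attention and/or feed-forward sub-layer of each transformer block, so that only these small modules receive gradient updates while the pretrained backbone stays fixed.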

Authors (3)
  1. Wenjuan Han (36 papers)
  2. Bo Pang (77 papers)
  3. Yingnian Wu (8 papers)
Citations (49)
