Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm (2303.07910v1)

Published 14 Mar 2023 in cs.CV

Abstract: Parameter-Efficient Transfer Learning (PETL) aims at efficiently adapting large models pre-trained on massive data to downstream tasks with limited task-specific data. Given the practical appeal of PETL, previous works focus on tuning a small set of parameters for each downstream task in an end-to-end manner, but rarely consider the task distribution shift between the pre-training task and the downstream task. This paper proposes a novel two-stage paradigm, where the pre-trained model is first aligned to the target distribution and the task-relevant information is then leveraged for effective adaptation. Specifically, the first stage narrows the task distribution shift by tuning the scale and shift parameters in the LayerNorm layers. In the second stage, to efficiently learn task-relevant information, we propose a Taylor expansion-based importance score that identifies task-relevant channels for the downstream task, and then tune only this small portion of channels, making the adaptation parameter-efficient. Overall, we present a promising new direction for PETL, and the proposed paradigm achieves state-of-the-art average accuracy across 19 downstream tasks.
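
The sketch below illustrates the two stages described in the abstract as they might look in PyTorch: stage 1 unfreezes only the LayerNorm scale (weight) and shift (bias) parameters, and stage 2 ranks output channels with a first-order Taylor importance score. This is a minimal, assumed implementation for illustration; the module names (`model`, `top_ratio`) and the exact form of the score are assumptions, not the authors' released code.

```python
# Hedged sketch of the two-stage paradigm (assumed implementation, not official code).
import torch
import torch.nn as nn


def stage1_tune_layernorm(model: nn.Module) -> None:
    """Stage 1: freeze all parameters except LayerNorm scale and shift."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, nn.LayerNorm):
            if m.weight is not None:
                m.weight.requires_grad = True
            if m.bias is not None:
                m.bias.requires_grad = True
    # ...then train on the downstream task, updating only these parameters.


def taylor_channel_scores(linear: nn.Linear) -> torch.Tensor:
    """Stage 2 (sketch): first-order Taylor importance per output channel.

    After a backward pass on downstream data, the loss change from removing a
    channel is approximated by |weight * grad| accumulated over that channel.
    The paper's exact score may differ; this is the common first-order form.
    """
    assert linear.weight.grad is not None, "run a backward pass first"
    return (linear.weight * linear.weight.grad).abs().sum(dim=1)


def select_top_channels(scores: torch.Tensor, top_ratio: float = 0.1) -> torch.Tensor:
    """Keep only the highest-scoring channels trainable (top_ratio is an assumed hyperparameter)."""
    k = max(1, int(top_ratio * scores.numel()))
    return scores.topk(k).indices
```

In use, one would first run stage 1 to align the model to the target distribution, then accumulate gradients on a few downstream batches, score channels with `taylor_channel_scores`, and fine-tune only the selected channels.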

Authors (6)
  1. Hengyuan Zhao (10 papers)
  2. Hao Luo (114 papers)
  3. Yuyang Zhao (24 papers)
  4. Pichao Wang (65 papers)
  5. Fan Wang (313 papers)
  6. Mike Zheng Shou (165 papers)
Citations (4)
