SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models (2405.00201v1)

Published 30 Apr 2024 in cs.CL and cs.AI

Abstract: Full fine-tuning is a popular approach for adapting Transformer-based pre-trained LLMs to a specific downstream task. However, the substantial requirements for computational power and storage have discouraged its widespread use. Moreover, increasing evidence of catastrophic forgetting and overparameterization in the Transformer architecture has motivated researchers to seek parameter-efficient fine-tuning (PEFT) methods. Commonly used PEFT methods such as LoRA and BitFit are typically applied across all layers of the model. We propose a PEFT method, called Stratified Progressive Adaptation Fine-tuning (SPAFIT), based on the localization of different types of linguistic knowledge to specific layers of the model. Our experiments on nine tasks from the GLUE benchmark show that SPAFIT outperforms other PEFT methods while fine-tuning only a fraction of the parameters they adjust.
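
The abstract does not spell out the exact layer grouping or which adaptation each group receives, so the following is only a minimal sketch of the stratified idea, assuming a 12-layer BERT encoder, a hypothetical 4/4/4 split, frozen bottom layers, BitFit-style bias-only tuning in the middle, and LoRA (via the Hugging Face peft library) on the top layers. The model name, the split, and the choice of adaptation per group are illustrative assumptions, not the paper's configuration.

```python
# Illustrative stratified PEFT setup (not the authors' exact SPAFIT recipe):
# freeze the bottom layers, tune only biases (BitFit-style) in the middle
# layers, and attach LoRA adapters to the top layers.
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Hypothetical layer split for a 12-layer encoder.
FROZEN_LAYERS = range(0, 4)        # group 1: no adaptation
BITFIT_LAYERS = range(4, 8)        # group 2: bias-only fine-tuning
LORA_LAYERS = list(range(8, 12))   # group 3: LoRA on attention projections

lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],
    layers_to_transform=LORA_LAYERS,  # restrict LoRA to the top group
)
model = get_peft_model(model, lora_cfg)  # freezes base weights, adds adapters

# Re-enable gradients for bias terms in the middle group only.
for name, param in model.named_parameters():
    if name.endswith(".bias") and any(
        f"encoder.layer.{i}." in name for i in BITFIT_LAYERS
    ):
        param.requires_grad = True

model.print_trainable_parameters()  # sanity check on the parameter budget
```

The split follows the common intuition that lower Transformer layers capture more general linguistic features while upper layers carry more task-specific knowledge, which is the kind of layer-wise localization the abstract appeals to; the trainable-parameter count printed at the end makes the budget comparison against all-layer LoRA or BitFit explicit.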

Authors (2)
  1. Samir Arora (1 paper)
  2. Liangliang Wang (22 papers)