Parameter Efficient Fine-Tuning (PEFT)
- PEFT is a set of techniques that update only a small fraction of model parameters, efficiently adapting pre-trained models to new tasks.
- It encompasses additive, selective, and reparameterized approaches that minimize resource use while preserving performance.
- Applications in NLP, vision, and multimodal tasks demonstrate its practical benefits in resource-constrained and scalable settings.
Parameter Efficient Fine-Tuning (PEFT) is a class of methodologies for adapting large pre-trained models to downstream tasks by updating only a small fraction of model parameters. PEFT aims to minimize computational and memory demands compared to full-model fine-tuning, while maintaining or approaching the performance of full adaptation. PEFT is widely applied to LLMs, encoder-decoder architectures, and models in computer vision and other domains, addressing the resource accessibility challenges posed by recent scaling trends in deep learning.
1. Core Principles and Methodological Taxonomy
PEFT frameworks are grounded in the principle that most of the knowledge acquired during pre-training is generalizable, and efficient transfer to new tasks is possible by adjusting only select parts of the model. Four principal methodological categories structure the landscape of PEFT:
- Additive PEFT: Small, trainable modules (adapters) are inserted into Transformer blocks or other layers. These modules are generally bottleneck neural networks whose output is added to the original layer activations. Adapter-based PEFT includes both serial and parallel configurations and can place lightweight modules after feed-forward or attention sublayers. For example, an adapter may operate as $\text{Adapter}(x) = x + W_{\text{up}}\,\sigma(W_{\text{down}}\,x)$,
where $W_{\text{down}} \in \mathbb{R}^{r \times d}$ reduces the dimensionality, $W_{\text{up}} \in \mathbb{R}^{d \times r}$ projects it back, and $\sigma$ is a nonlinearity (2403.14608, 2504.14117).
- Selective PEFT: Rather than introducing new parameters, selective approaches update only a subset of existing ones. The choice of which parameters to update may rely on pre-determined rules (e.g., biases only, as in BitFit) or on data-driven criteria such as magnitude, gradient sensitivity, or Fisher information. Updates are masked as $\theta_i \leftarrow \theta_i - \eta\, m_i\, \nabla_{\theta_i}\mathcal{L}$,
where $m_i \in \{0, 1\}$ indicates whether parameter $\theta_i$ is trainable (2403.14608, 2504.14117). Automatic subset selection methods (e.g., DiffPruning, FISH Mask, AdaPEFT) leverage second-order statistics such as the Hessian to maximize loss reduction per parameter under a fixed budget (2505.12579, 2305.16742).
- Reparameterized PEFT: Weight matrices are augmented through low-rank or other decompositions. The most prominent variant is LoRA (Low-Rank Adaptation), which expresses the parameter update as $W = W_0 + \Delta W = W_0 + \tfrac{\alpha}{r} B A$,
with $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, rank $r \ll \min(d, k)$, and scaling factor $\alpha$ (2403.14608, 2304.14999, 2504.14117). Variants include dynamic rank adjustment and spectral or nonlinear low-rank updates. A minimal code sketch of the adapter and LoRA updates follows this list.
- Hybrid and Unified Frameworks: Modern approaches often combine several mechanisms, e.g., soft prompts plus adapters, low-rank updates plus selective masking, or mixture-of-experts with PEFT modules (e.g., PERFT for MoE models) to leverage complementary strengths for improved parameter efficiency and task coverage (2411.08212, 2504.14117).
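To ground the additive and reparameterized categories, the following PyTorch sketch implements a bottleneck adapter and a LoRA-wrapped linear layer corresponding to the formulas above. The module names, initialization choices, and default hyperparameters (bottleneck size, rank, scaling) are illustrative assumptions rather than a reference implementation from any of the cited works.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Additive PEFT: Adapter(x) = x + W_up * sigma(W_down * x)."""

    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)   # W_down: d -> r
        self.up = nn.Linear(bottleneck, d_model)     # W_up:   r -> d
        self.act = nn.GELU()                         # nonlinearity sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))   # residual addition


class LoRALinear(nn.Module):
    """Reparameterized PEFT: W = W_0 + (alpha / r) * B A, with W_0 frozen."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)       # freeze pre-trained W_0
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # A: r x k
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # B: d x r, zero init
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus trainable low-rank update.
        return self.base(x) + self.scaling * (x @ self.A.T) @ self.B.T


# Usage: wrap a frozen projection and report the trainable fraction.
layer = LoRALinear(nn.Linear(768, 768), r=8, alpha=16.0)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable fraction: {trainable / total:.4f}")
```

Zero-initializing $B$ keeps the wrapped layer's initial output identical to that of the frozen base layer, so adaptation starts exactly from the pre-trained behavior.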
2. Benchmarking, Performance, and Efficiency
Empirical assessment of PEFT methods has focused on their ability to match the performance of full fine-tuning (FFT) while providing savings in resource consumption:
- Benchmarking on LLMs: Uniform evaluations across FLAN-T5 tasks—classification (AG News, CoLA) and generation (E2E, SAMSum)—demonstrate that LoRA and BitFit close the gap with FFT as the amount of training data increases. In low-resource settings, FFT is typically faster and more accurate, but PEFT methods become competitive in accuracy with sufficient data while remaining far more parameter-efficient (2304.14999).
- Convergence Analysis: PEFT methods generally require more epochs to converge under sparse data regimes. Full fine-tuning converges up to 73–87% faster in low-data settings; however, with larger datasets, the stability and final accuracy of PEFT methods approach parity with FFT (2304.14999).
- Parameter and Memory Efficiency: PEFT methods such as RED (Representation Editing) and LoReFT (Representation Fine-Tuning) update as little as 0.025–0.1% of the parameters while maintaining state-of-the-art results on reasoning and structured prediction tasks (2404.13506, 2402.15179). Adapter-based methods with layer selection can reduce trainable parameters by 50% or more while preserving performance (2304.14999).
- Practical Resource Implications: PEFT drastically reduces memory footprint and compute, enabling large models to be adapted on modest hardware or in settings with many concurrent downstream tasks (e.g., personalized models per user/profile) (2401.16137, 2304.14999).
3. Applications Across Modalities and Domains
The versatility of PEFT extends across multiple domains:
| Domain | Key Techniques | Representative Tasks / Models |
|---|---|---|
| NLP / LLMs | LoRA, Adapters, BitFit, Selective Masking | Question answering, summarization, reasoning, instruction tuning, translation (2403.14608) |
| Vision | Visual Adapters, Spectral/Graph Adapters | Image classification, segmentation (SAM-COBOT), object detection, 3D point cloud learning (2410.08114, 2311.17112) |
| Multimodal | Prompt Tuning, LoRA, MoE-PEFT | Vision-LLMs, generative models (e.g., CLIP, DALL-E, LLaVA) (2501.13787, 2411.08212) |
| Generative/Scientific | LoRA-Conv, ReFT | Medical imaging, seismic inversion, protein folding, mathematical reasoning (2412.19510, 2404.13506) |
Significant empirical findings include out-of-domain generalization benefits in low-data translation (2404.04212), robust multilingual transfer with careful LoRA rank/quantization settings (2401.07598), and improved scalability for many-profile adaptation with methods like X-PEFT (2401.16137).
4. Design Considerations, Module Selection, and Search
The PEFT design space encompasses choices of module type, placement, and capacity:
- Layer and Module Selection: Strategic tuning of only later layers or attention modules can maintain or improve task performance while halving the parameter count, indicating that task-specific representations reside largely in the upper layers (2304.14999).
- Automated Configuration Search: Searching over possible PEFT module types and layer placements (architecture search) can be prohibitively expensive. PrunePEFT reframes the search as a pruning task, iteratively removing redundant modules via a hybrid criterion that fuses sensitivity measures (activation, gradient, Taylor-based) (2506.07587). This approach yields near-optimal subnetworks at a fraction of the resource cost of traditional search.
- Budget-Guided and Adaptive Methods: Techniques like BIPEFT and AdaPEFT combine parameter budgets with automated search or Hessian-informed parameter selection, recasting module selection as a knapsack/Pareto optimization problem and adjusting active parameter sets for maximum influence under resource constraints (2410.09079, 2505.12579).
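As a concrete illustration of budget-guided selection, the sketch below greedily enables the parameter groups with the highest sensitivity scores (here a diagonal-Fisher-style proxy, the mean squared gradient) until a trainable-parameter budget is exhausted. The scoring rule, group granularity, and greedy loop are simplifying assumptions, not the AdaPEFT, BIPEFT, or PrunePEFT procedures themselves.

```python
import torch
import torch.nn as nn


def select_trainable_groups(model: nn.Module, loss: torch.Tensor, budget: int):
    """Greedy, budget-constrained selection of parameter groups by a
    diagonal-Fisher-style score (mean squared gradient per group)."""
    loss.backward()                                  # populate .grad on all parameters
    scored = []
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        score = p.grad.pow(2).mean().item()          # sensitivity proxy
        scored.append((score, name, p))
    # Knapsack-style greedy: highest-scoring groups first, until the budget is used.
    scored.sort(key=lambda t: t[0], reverse=True)
    kept, used = [], 0
    for score, name, p in scored:
        p.requires_grad_(False)                      # freeze by default
        if used + p.numel() <= budget:
            p.requires_grad_(True)                   # unfreeze selected group
            kept.append(name)
            used += p.numel()
    return kept


# Usage on a toy model with a ~10k trainable-parameter budget.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
print(select_trainable_groups(model, loss, budget=10_000))
```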
5. Theoretical Foundations and Extensions
Recent work has unified the PEFT landscape through the lens of matrix decomposition and subspace tuning:
- Subspace Tuning Perspective: All PEFT methods can be interpreted as searching for transformations that reconstruct or augment the subspace spanned by a pre-trained weight matrix $W$ (with SVD $W = U \Sigma V^\top$). Reconstruction-based PEFT (e.g., (IA)³, SSB) adjusts the singular spaces, while extension-based methods (e.g., LoRA, MPC frameworks) add new low-rank bases (2407.05417); a minimal sketch contrasting the two views follows this list.
- Matrix Pattern Constraint Framework: The imposition or relaxation of constraints (e.g., semi-orthogonality in low-rank factors) is shown to critically affect expressivity and learning dynamics. Properly balancing such constraints (e.g., via MPC variants) enables PEFT schemes to approach full-tuning performance while preserving parameter efficiency (2407.05417).
- Domain-Specific Innovations: For example, PointGST introduces spectral domain adapters (graph Fourier basis) to handle geometric structure in point cloud data—demonstrating that domain-specific PEFT module design can outperform traditional full fine-tuning (2410.08114).
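To make the subspace view concrete, the toy PyTorch sketch below contrasts a reconstruction-style update (rescaling the pre-trained singular directions) with an extension-style update (adding new low-rank directions). The singular-value rescaling is a generic stand-in for the reconstruction-based family and is not the actual (IA)³ or SSB procedure.

```python
import torch

torch.manual_seed(0)
d, k, r = 64, 64, 4
W0 = torch.randn(d, k)                               # pre-trained weight
U, S, Vh = torch.linalg.svd(W0, full_matrices=False)  # W0 = U diag(S) Vh

# Reconstruction-based view: keep the pre-trained singular directions
# and learn only a diagonal rescaling of the spectrum.
scale = torch.nn.Parameter(torch.ones(S.shape[0]))
W_reconstructed = U @ torch.diag(S * scale) @ Vh

# Extension-based view (e.g., LoRA): leave W0 untouched and add new
# low-rank basis directions B A outside the original spectrum.
B = torch.nn.Parameter(torch.zeros(d, r))
A = torch.nn.Parameter(torch.randn(r, k) * 0.01)
W_extended = W0 + B @ A

print(W_reconstructed.shape, W_extended.shape)
```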
6. Practical Impact, Limitations, and Future Directions
PEFT's reduced resource requirements have democratized the deployment and personalization of large models across domains, but several nontrivial considerations remain:
- Hyperparameter Sensitivity: Methods such as LoRA and adapters require careful tuning of rank or bottleneck size and are more sensitive in low-data regimes (2411.16775, 2403.14608). Instability and suboptimal generalization can arise if these parameters and training conditions (learning rate, diversity of tasks) are not properly managed.
- Generalization and Transfer: In some settings, PEFT methods excel at out-of-domain generalization or transfer learning (e.g., seismic FWI, low-resource NMT), while in others, especially tasks demanding complex reasoning, coding, or long-form generation, full fine-tuning remains superior. LoRA often outperforms adapter-based methods on open instruction tasks, but may require more data for equivalent generalization (2411.16775).
- Scalability and Interpretability: Challenges include reliable scaling of PEFT to ultra-large models, transparent understanding of which parameters carry task-specific information, and ensuring that PEFT modules do not undercut the underlying model’s general capabilities (2504.14117, 2501.13787).
- Unified Benchmarking and Theoretical Grounding: Calls for unified, standardized evaluation and deeper theoretical insights into why and when particular PEFT methods succeed are frequent (2403.14608, 2501.13787).
- Emerging Areas: Research trajectories highlighted for future work include federated and privacy-preserving PEFT, multi-profile adaptation (X-PEFT), dynamic and automated module selection, continual learning, domain-specific module design, and integration with model compression and quantization methods (2410.09079, 2401.16137, 2506.07587).
7. Representative Mathematical Formulations
PEFT methods are typically formalized via update rules for parameters (selective or additive) and efficiency metrics. Key representative formulas include:
- Selective update/masking: $\theta_i \leftarrow \theta_i - \eta\, m_i\, \nabla_{\theta_i}\mathcal{L}$, with mask $m_i \in \{0, 1\}$
- Low-rank adaptation: $W = W_0 + \Delta W = W_0 + \tfrac{\alpha}{r} B A$, with $\operatorname{rank}(BA) \le r \ll \min(d, k)$
- Adapter bottleneck: $\text{Adapter}(x) = x + W_{\text{up}}\,\sigma(W_{\text{down}}\,x)$
- Subspace tuning: with $W = U \Sigma V^\top$, reconstruction-based methods adjust $U$, $\Sigma$, or $V$, while extension-based methods add a low-rank term $W + BA$
- Parameter/performance efficiency: task performance achieved relative to the fraction of trainable parameters, $|\theta_{\text{trainable}}| / |\theta_{\text{total}}|$
These formulations reflect the central objective of maximizing adaptation gains under strict parameter and compute constraints.
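For concreteness, a short worked example of the parameter-efficiency metric above; the base-model size, rank, and set of wrapped projections are illustrative assumptions.

```python
# Worked example: LoRA (r = 8) applied to the query and value projections
# of a hypothetical 7B-parameter decoder with 32 layers and hidden size 4096.
hidden, r, layers, wrapped_per_layer = 4096, 8, 32, 2
total_params = 7e9                                                 # assumed base model size
lora_params = layers * wrapped_per_layer * r * (hidden + hidden)   # r * (d_in + d_out) per wrapped layer
fraction = lora_params / total_params
print(f"trainable LoRA parameters: {lora_params:,.0f} ({fraction:.4%} of the model)")
```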
Parameter Efficient Fine-Tuning has evolved into a versatile and theoretically principled field, bridging efficient, scalable model adaptation with strong practical performance across domains, underpinned by a growing ecosystem of modular methodologies and empirical best practices (2304.14999, 2501.13787, 2505.12579, 2411.16775, 2504.14117).