Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models (2501.14406v3)

Published 24 Jan 2025 in cs.DC, cs.AI, cs.LG, and cs.NI

Abstract: Pre-trained Language Models (PLMs) have demonstrated their superiority and versatility in modern NLP, effectively adapting to various downstream tasks through further fine-tuning. Federated Parameter-Efficient Fine-Tuning (FedPEFT) has emerged as a promising solution to address privacy and efficiency challenges in distributed training for PLMs on resource-constrained local devices. However, our measurements reveal two key limitations of FedPEFT: heterogeneous data across devices exacerbates performance degradation of low-rank adaptation, and a fixed parameter configuration results in communication inefficiency. To overcome these limitations, we propose FedARA, a novel Adaptive Rank Allocation framework for federated parameter-efficient fine-tuning of language models. Specifically, FedARA employs truncated Singular Value Decomposition (SVD) adaptation to enhance similar feature representation across clients, significantly mitigating the adverse effects of data heterogeneity. Subsequently, it utilizes dynamic rank allocation to progressively identify critical ranks, effectively improving communication efficiency. Lastly, it leverages rank-based module pruning to automatically remove inactive modules, steadily reducing local computational cost and memory usage in each federated learning round. Extensive experiments show that FedARA consistently outperforms baselines by an average of 6.95% to 8.49% across various datasets and models under heterogeneous data while significantly improving communication efficiency by 2.40×. Moreover, experiments on various edge devices demonstrate substantial decreases in total training time and energy consumption by up to 48.90% and 46.95%, respectively.
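
The abstract names three mechanisms: an SVD-style adapter (in place of plain low-rank adaptation), dynamic rank allocation, and rank-based pruning. The PyTorch sketch below is a minimal illustration of that general idea, not the authors' implementation; the class name SVDAdapter, the rank_mask buffer, and the magnitude-based prune_ranks heuristic are assumptions made here for illustration. A frozen base layer is augmented with a low-rank update of the form W0 + P·diag(λ)·Q, and ranks with small |λ| are masked out so they no longer need to be trained or communicated.

```python
# Illustrative sketch only -- not the FedARA release. Names, initialization,
# and the pruning criterion are assumptions for illustration.
import torch
import torch.nn as nn


class SVDAdapter(nn.Module):
    """Frozen base weight plus a truncated-SVD-style low-rank update."""

    def __init__(self, base_linear: nn.Linear, max_rank: int = 8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad_(False)  # only the adapter is trained and communicated

        out_f, in_f = base_linear.out_features, base_linear.in_features
        self.P = nn.Parameter(torch.randn(out_f, max_rank) * 0.01)   # left factor
        self.Q = nn.Parameter(torch.randn(max_rank, in_f) * 0.01)    # right factor
        self.lam = nn.Parameter(torch.zeros(max_rank))               # per-rank scale
        self.register_buffer("rank_mask", torch.ones(max_rank))      # 1 = active rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Delta W = P * diag(lam * mask) * Q; pruned ranks contribute nothing.
        delta = (self.P * (self.lam * self.rank_mask)) @ self.Q
        return self.base(x) + x @ delta.T

    @torch.no_grad()
    def prune_ranks(self, keep: int):
        # Keep the `keep` ranks with the largest |lambda|; zero the rest so their
        # parameters need not be updated locally or sent to the server.
        scores = (self.lam * self.rank_mask).abs()
        topk = torch.topk(scores, k=keep).indices
        new_mask = torch.zeros_like(self.rank_mask)
        new_mask[topk] = 1.0
        self.rank_mask.copy_(new_mask)


if __name__ == "__main__":
    layer = SVDAdapter(nn.Linear(64, 64), max_rank=8)
    x = torch.randn(2, 64)
    print(layer(x).shape)      # torch.Size([2, 64])
    layer.prune_ranks(keep=4)  # e.g., drop half the ranks after a round
```

Masking rather than resizing keeps tensor shapes fixed across clients, which simplifies server-side aggregation in a federated round; the criterion FedARA actually uses to allocate and prune ranks may differ from the simple |λ| ranking shown here.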
