Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
134 tokens/sec
GPT-4o
10 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

CopRA: A Progressive LoRA Training Strategy (2410.22911v1)

Published 30 Oct 2024 in cs.LG

Abstract: Low-Rank Adaptation (LoRA) is a parameter-efficient technique for rapidly fine-tuning foundation models. In standard LoRA training dynamics, models tend to quickly converge to a local optimum near the initialization. However, this local optimum may not be ideal for out-of-distribution data or tasks such as merging and pruning. In this work, we propose a novel progressive training strategy for LoRA with random layer dropping. This strategy also optimizes the Shapley value of LoRA parameters in each layer, treating each layer as a player in a cooperative game. We refer to this method as Cooperative LoRA (CopRA). Our experimental results demonstrate that parameters trained with CopRA exhibit linear mode connectivity, which enables efficient model merging. This also paves the way for federated learning and multi-task learning via LoRA merging. Additionally, by optimizing the Shapley value, CopRA shows superior performance in pruning tasks.

Summary

We haven't generated a summary for this paper yet.