
AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections (2505.12455v1)

Published 18 May 2025 in cs.LG

Abstract: Low-Rank Adaptation (LoRA) has emerged as an effective technique for reducing memory overhead when fine-tuning LLMs. However, it often yields sub-optimal performance compared with full fine-tuning because the update is constrained to a low-rank space. Recent variants such as LoRA-Pro attempt to mitigate this by adjusting the gradients of the low-rank matrices to approximate the full gradient. However, LoRA-Pro's solution is not unique, and different solutions lead to significantly varying performance in ablation studies. Moreover, to incorporate momentum or adaptive optimization, approaches like LoRA-Pro must first compute the equivalent gradient, incurring a memory cost close to that of full fine-tuning. A key challenge remains: integrating momentum properly into the low-rank space at lower memory cost. In this work, we propose AltLoRA, an alternating projection method that avoids the difficulties in gradient approximation introduced by the joint update design while integrating momentum without higher memory complexity. Our theoretical analysis provides convergence guarantees and further shows that AltLoRA enables stable feature learning and robustness to transformation invariance. Extensive experiments across multiple tasks demonstrate that AltLoRA outperforms LoRA and its variants, narrowing the gap to full fine-tuning while preserving superior memory efficiency.
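To make the alternating-update idea concrete, below is a minimal, hypothetical PyTorch sketch of block-wise LoRA updates with per-factor momentum: the factors A and B are updated in separate half-steps rather than jointly, and each momentum buffer has the same low-rank shape as its factor, so nothing of size m x n beyond the frozen weight is ever stored. This is only an illustration of the general pattern under those assumptions, not the authors' AltLoRA algorithm; the projection steps and theoretical guarantees in the paper are more involved.

    # Hypothetical sketch of alternating low-rank updates with per-factor momentum.
    # Not the paper's reference implementation.
    import torch

    torch.manual_seed(0)
    m, n, r = 64, 32, 4          # weight is m x n, adapter rank r
    lr, beta = 1e-2, 0.9         # step size and momentum coefficient

    W0 = torch.randn(m, n)           # frozen pretrained weight
    B = torch.zeros(m, r)            # LoRA "up" factor, initialised to zero
    A = torch.randn(r, n) * 0.01     # LoRA "down" factor
    mA = torch.zeros_like(A)         # momentum buffer for A (r x n)
    mB = torch.zeros_like(B)         # momentum buffer for B (m x r)

    x = torch.randn(128, n)          # toy regression data
    y = torch.randn(128, m)

    def loss_fn(W):
        """Mean-squared error of a linear map with the adapted weight."""
        return 0.5 * ((x @ W.T - y) ** 2).mean()

    for step in range(200):
        # Half-step 1: update A while B is held fixed.
        A.requires_grad_(True)
        (gA,) = torch.autograd.grad(loss_fn(W0 + B @ A), A)
        mA = beta * mA + gA
        A = (A - lr * mA).detach()

        # Half-step 2: update B using the refreshed A, held fixed.
        B.requires_grad_(True)
        (gB,) = torch.autograd.grad(loss_fn(W0 + B @ A), B)
        mB = beta * mB + gB
        B = (B - lr * mB).detach()

    print(f"final loss: {loss_fn(W0 + B @ A).item():.4f}")

Because the momentum buffers mA and mB live in the factor shapes (r x n and m x r), the optimizer state stays at LoRA-level memory rather than the full m x n cost that arises when an equivalent full gradient must be materialized.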

Authors (4)
  1. Xin Yu (192 papers)
  2. Yujia Wang (29 papers)
  3. Jinghui Chen (50 papers)
  4. Lingzhou Xue (45 papers)
