Scaling LoRA-GA to larger pretrained models
Determine whether Low-Rank Adaptation with Gradient Approximation (LoRA-GA) retains its convergence-speed and performance advantages on substantially larger pretrained models such as Llama 2-70B, thereby establishing its effectiveness at scale relative to full fine-tuning.
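To make the technique concrete, the sketch below shows, for a single linear layer in PyTorch, how a LoRA adapter can be initialized from the SVD of an estimated full-weight gradient so that its early update approximates the full fine-tuning step. The function name, the choice of which singular directions go to A versus B, the scaling factor, and the frozen-weight offset are illustrative assumptions and may differ from the paper's actual procedure.

```python
# Illustrative sketch (not the paper's reference implementation) of a
# gradient-approximation-style LoRA initialization for one linear layer.
import torch

def gradient_aligned_lora_init(weight_grad: torch.Tensor, rank: int, alpha: float = 16.0):
    """Build LoRA factors B (d_out x r) and A (r x d_in) from the SVD of an
    estimated full-weight gradient G (d_out x d_in), so that the adapter's
    first update spans the top singular directions of G.
    Assumes 2 * rank <= min(d_out, d_in)."""
    U, S, Vh = torch.linalg.svd(weight_grad.float(), full_matrices=False)
    # Disjoint singular subspaces for B and A (assumed split; the paper's
    # exact slicing and scaling may differ).
    B = U[:, :rank].contiguous()             # d_out x r, top left singular directions
    A = Vh[rank:2 * rank, :].contiguous()    # r x d_in, next right singular directions
    scaling = alpha / rank
    # Offset subtracted from the frozen weight so the adapted layer reproduces
    # the pretrained output at step 0: (W - offset) x + scaling * B @ (A @ x) = W x.
    offset = scaling * (B @ A)
    return B, A, offset
```

For a model at the Llama 2-70B scale, even estimating `weight_grad` for every adapted projection on a calibration batch carries nontrivial memory and compute cost, which is part of what a scaled-up validation would need to account for.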
References
However, due to computational resource constraints, we have not validated LoRA-GA on larger pre-trained models (e.g., Llama 2-70B).
— LoRA-GA: Low-Rank Adaptation with Gradient Approximation
(arXiv:2407.05000, Wang et al., 6 Jul 2024), Section: Limitations