LoRA Training in the NTK Regime has No Spurious Local Minima (2402.11867v3)

Published 19 Feb 2024 in cs.LG and math.OC

Abstract: Low-rank adaptation (LoRA) has become the standard approach for parameter-efficient fine-tuning of large language models (LLMs), but our theoretical understanding of LoRA has been limited. In this work, we theoretically analyze LoRA fine-tuning in the neural tangent kernel (NTK) regime with $N$ data points, showing: (i) full fine-tuning (without LoRA) admits a low-rank solution of rank $r\lesssim \sqrt{N}$; (ii) using LoRA with rank $r\gtrsim \sqrt{N}$ eliminates spurious local minima, allowing gradient descent to find the low-rank solutions; (iii) the low-rank solution found using LoRA generalizes well.

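To make the rank prescription in (ii) concrete, below is a minimal PyTorch-style sketch of LoRA on a single linear layer with rank chosen on the order of $\sqrt{N}$. The class name `LoRALinear`, the layer dimensions, and the initialization are illustrative assumptions, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update W0 + B @ A."""
    def __init__(self, base: nn.Linear, rank: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay fixed
        d_out, d_in = base.weight.shape
        # A random, B zero: fine-tuning starts exactly at the pretrained model
        self.A = nn.Parameter(torch.randn(rank, d_in) / math.sqrt(d_in))
        self.B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

# Rank prescription from the paper: r on the order of sqrt(N), N = number of data points
N = 10_000
r = math.ceil(math.sqrt(N))  # r = 100
layer = LoRALinear(nn.Linear(768, 768), rank=r)
```

With this choice of rank, only the factors `A` and `B` are trained, which is the setting the paper analyzes in the NTK regime.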
Authors (3)
  1. Uijeong Jang (2 papers)
  2. Jason D. Lee (151 papers)
  3. Ernest K. Ryu (54 papers)
Citations (9)