LoRA Training in the NTK Regime has No Spurious Local Minima (2402.11867v3)

Published 19 Feb 2024 in cs.LG and math.OC

Abstract: Low-rank adaptation (LoRA) has become the standard approach for parameter-efficient fine-tuning of large language models (LLMs), but our theoretical understanding of LoRA has been limited. In this work, we theoretically analyze LoRA fine-tuning in the neural tangent kernel (NTK) regime with $N$ data points, showing: (i) full fine-tuning (without LoRA) admits a low-rank solution of rank $r\lesssim \sqrt{N}$; (ii) using LoRA with rank $r\gtrsim \sqrt{N}$ eliminates spurious local minima, allowing gradient descent to find the low-rank solutions; (iii) the low-rank solution found using LoRA generalizes well.

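To make the rank prescription in (ii) concrete, below is a minimal PyTorch-style sketch of LoRA on a single linear layer with rank chosen on the order of $\sqrt{N}$. The class name `LoRALinear`, the layer dimensions, and the initialization are illustrative assumptions, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update W0 + B @ A."""
    def __init__(self, base: nn.Linear, rank: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay fixed
        d_out, d_in = base.weight.shape
        # A random, B zero: fine-tuning starts exactly at the pretrained model
        self.A = nn.Parameter(torch.randn(rank, d_in) / math.sqrt(d_in))
        self.B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

# Rank prescription from the paper: r on the order of sqrt(N), N = number of data points
N = 10_000
r = math.ceil(math.sqrt(N))  # r = 100
layer = LoRALinear(nn.Linear(768, 768), rank=r)
```

With this choice of rank, only the factors `A` and `B` are trained, which is the setting the paper analyzes in the NTK regime.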
Authors (3)
  1. Uijeong Jang (2 papers)
  2. Jason D. Lee (151 papers)
  3. Ernest K. Ryu (54 papers)
Citations (9)