Benchmark generalization of LoRA-GA across datasets
Determine whether the performance improvements and convergence behavior that LoRA-GA shows on MTBench, GSM8K, and Human-eval also hold on a broader set of evaluation datasets and benchmarks, and thereby establish how general its advantages are.
References
Another limitation pertains to our evaluation scope. While we provide evaluations on MTBench, GSM8K, and Human-eval, we did not assess our method on other datasets. Consequently, we cannot fully guarantee that our findings are universally consistent across all benchmarks.
— LoRA-GA: Low-Rank Adaptation with Gradient Approximation
(2407.05000 - Wang et al., 6 Jul 2024) in Section: Limitations