Locate the performance–precision trade-off for QLoRA tuning
Determine where the performance–precision trade-off lies for QLoRA finetuning, which backpropagates gradients through a frozen, 4-bit quantized pretrained language model into Low-Rank Adapters (LoRA), by identifying the precision levels at which QLoRA ceases to match full 16-bit finetuning performance.
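As a starting point, the sketch below shows how a 4-bit QLoRA setup is typically configured with the Hugging Face transformers, peft, and bitsandbytes stack; the model name, target modules, and hyperparameters are illustrative assumptions, and probing the trade-off would mean repeating such runs while varying the quantization settings and comparing against a full 16-bit finetuning baseline.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Illustrative checkpoint; any causal LM supported by bitsandbytes works.
model_name = "meta-llama/Llama-2-7b-hf"

# 4-bit NF4 quantization with double quantization, as used in QLoRA.
# Sweeping these settings (e.g. load_in_8bit vs. load_in_4bit) is one way
# to probe the precision axis of the trade-off.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Base weights are loaded quantized and kept frozen.
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters are trained in higher precision; gradients are
# backpropagated through the frozen quantized base weights into them.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```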
References
Since we did not observe performance degradation compared to full-finetuning in our experiments with 4-bit finetuning, this raises the question of where the performance-precision trade-off exactly lies for QLoRA tuning, which we leave to future work to explore.
— QLoRA: Efficient Finetuning of Quantized LLMs
(Dettmers et al., 2023, arXiv:2305.14314), in Summary, Section "QLoRA vs. Standard Finetuning"