BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models (2406.11675v4)

Published 17 Jun 2024 in cs.LG, cs.AI, cs.CL, and stat.ML

Abstract: Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, the performance of such post-training approaches is severely limited by the parameters learned during training. In this paper, we go beyond post-training Bayesianization and propose Bayesian Low-Rank Adaptation by Backpropagation (BLoB), an algorithm that continuously and jointly adjusts both the mean and covariance of LLM parameters throughout the whole fine-tuning process. Our empirical results verify the effectiveness of BLoB in terms of generalization and uncertainty estimation when evaluated on both in-distribution and out-of-distribution data.
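
The abstract describes learning both a mean and a covariance for low-rank adapter weights by backpropagation during fine-tuning. As a rough illustration of that general mechanism (not the authors' exact parameterization, variational family, or update schedule, which are specified in the paper), here is a minimal sketch of a Bayesian LoRA layer: the low-rank factor A is given a diagonal Gaussian posterior whose mean and log-standard-deviation are trained jointly via the reparameterization trick, with a KL term pulling the posterior toward an isotropic prior. All names here (`BayesianLoRALinear`, `prior_std`, `kl_to_prior`) are illustrative assumptions.

```python
# Hedged sketch of Bayesian low-rank adaptation trained by backpropagation.
# NOT BLoB's exact formulation; it only illustrates the idea in the abstract:
# jointly updating the mean and (diagonal) covariance of low-rank weights.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLoRALinear(nn.Module):
    """Frozen base weight W plus a low-rank update B @ A, where A has a
    diagonal Gaussian posterior with a learned mean and standard deviation."""

    def __init__(self, base: nn.Linear, rank: int = 8, prior_std: float = 0.1):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # keep the pretrained weight fixed
        in_f, out_f = base.in_features, base.out_features
        # Variational posterior over A: mean and log-sigma, learned jointly.
        self.A_mu = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.A_logsig = nn.Parameter(torch.full((rank, in_f), math.log(1e-2)))
        # B is kept deterministic in this sketch (a simplifying assumption).
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        self.prior_std = prior_std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: sample A = mu + sigma * eps so gradients
        # flow to both the mean and the covariance parameters.
        sigma = self.A_logsig.exp()
        A = self.A_mu + sigma * torch.randn_like(sigma)
        return self.base(x) + F.linear(F.linear(x, A), self.B)

    def kl_to_prior(self) -> torch.Tensor:
        # KL(q || p) between the diagonal Gaussian posterior and a zero-mean
        # isotropic Gaussian prior, summed over all entries of A.
        var, pvar = self.A_logsig.exp().pow(2), self.prior_std ** 2
        return 0.5 * (var / pvar + self.A_mu.pow(2) / pvar
                      - 1.0 - (var / pvar).log()).sum()
```

In a setup like this, the fine-tuning objective would combine the task loss with the summed KL terms (the standard evidence lower bound), and at inference multiple stochastic forward passes can be averaged to obtain calibrated predictive uncertainty, matching the in-distribution and out-of-distribution evaluation the abstract mentions.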

Authors (5)
  1. Yibin Wang (26 papers)
  2. Haizhou Shi (25 papers)
  3. Ligong Han (39 papers)
  4. Dimitris Metaxas (85 papers)
  5. Hao Wang (1120 papers)
Citations (1)
