Repulsive Bayesian Prompt Learning (ReBaPL)

Updated 29 November 2025
  • ReBaPL introduces repulsive Bayesian techniques that integrate diversity-inducing forces into prompt learning to overcome mode collapse and enhance generalization.
  • ReBaPL leverages particle-based methods like SVGD and SGHMC to approximate a multimodal posterior over soft prompt parameters for improved transfer performance.
  • Empirical evaluations show that the repulsive mechanism maintains ensemble diversity, leading to better out-of-distribution generalization and sample efficiency.

Repulsive Bayesian Prompt Learning (ReBaPL) refers to a family of Bayesian prompt learning techniques that incorporate explicit diversity-inducing (repulsive) mechanisms into the inference and optimization of soft prompt parameters for large-scale pre-trained models. ReBaPL algorithms seek to approximate the full, potentially multimodal posterior over prompt parameters by maintaining an ensemble of particles—each representing plausible prompt hypotheses—while enforcing diversity via repulsive forces among the particles. This methodology addresses the limitations of both classical maximum likelihood prompt tuning and conventional Bayesian prompt learning, particularly regarding mode collapse, overfitting, and generalization under distribution shift (Lee et al., 13 Feb 2024, Bendou et al., 21 Nov 2025).

1. Bayesian Formulation of Prompt Learning

Let $\theta$ denote a prompt parameter vector (often a soft prompt, i.e., a set of continuous embeddings) and $D=\{(x_i, y_i)\}_{i=1}^n$ a training dataset. In maximum likelihood prompt learning, $\theta$ is optimized as

$$\theta^* = \arg\min_\theta \left[ -\frac{1}{n} \sum_{i=1}^n \log p(y_i \mid x_i, \theta) \right],$$

where $p(y|x,\theta)$ is the predictive model output (speech, vision–language, or text, as appropriate) with frozen model weights and a tunable prompt. However, this approach can result in overfitting and poor out-of-distribution (OOD) generalization, particularly in few-shot settings.
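
To make the baseline concrete, here is a minimal PyTorch-style sketch of maximum likelihood prompt tuning; `frozen_model` (mapping a soft prompt and a batch of inputs to logits) and all hyperparameters are hypothetical placeholders, not an interface from either cited paper.

```python
import torch
import torch.nn.functional as F

def mle_prompt_tuning(frozen_model, data_loader, prompt_len=16, embed_dim=512,
                      steps=1000, lr=1e-3):
    """Optimize only the soft prompt; all backbone weights stay frozen."""
    prompt = torch.randn(prompt_len, embed_dim, requires_grad=True)  # theta
    opt = torch.optim.AdamW([prompt], lr=lr)
    for _, (x, y) in zip(range(steps), data_loader):
        logits = frozen_model(prompt, x)   # model's predictive output p(y|x, theta)
        loss = F.cross_entropy(logits, y)  # -(1/n) * sum_i log p(y_i|x_i, theta)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return prompt.detach()
```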

Bayesian prompt learning addresses this by placing a prior $p(\theta)$ on the prompt and inferring the posterior

$$p(\theta|D) \propto p(D|\theta)\, p(\theta),$$

with $p(D|\theta)$ typically given by the model likelihood. In multi-task transfer, e.g., Bayesian Multi-Task Prompt Tuning (BMTPT), the posterior $p(\theta \mid \bigcup_k D^k)$ is constructed jointly or from independent task datasets, making use of weakly informative (often uniform or very broad Gaussian) priors (Lee et al., 13 Feb 2024).
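
Both particle methods described below need only the unnormalized log posterior and its gradient. A minimal sketch, assuming the hypothetical `frozen_model` interface from above and an illustrative broad Gaussian prior (the prior scale is an assumption, not a value from the papers):

```python
import torch
import torch.nn.functional as F

def log_posterior(frozen_model, prompt, x, y, prior_std=10.0):
    """Unnormalized log p(theta|D) = log p(D|theta) + log p(theta)."""
    logits = frozen_model(prompt, x)
    log_lik = -F.cross_entropy(logits, y, reduction="sum")   # log p(D|theta)
    log_prior = -0.5 * (prompt ** 2).sum() / prior_std ** 2  # broad Gaussian prior
    return log_lik + log_prior
```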

2. Repulsive Mechanisms in Bayesian Prompt Inference

Standard variational inference or unimodal Gaussian approximations tend to capture only a single mode of $p(\theta|D)$, potentially missing diverse and functionally distinct solutions. ReBaPL achieves a richer posterior approximation through interacting particle-based methods, explicitly introducing repulsive terms that prevent mode collapse.

2.1 SVGD-based Repulsion (NLP context)

In BMTPT, Stein Variational Gradient Descent (SVGD) is used to approximate $p(\theta|D)$ by evolving a collection of $M$ prompt particles $\{\varphi_i\}_{i=1}^M$ under the update:

$$\varphi_i \leftarrow \varphi_i + \eta\, \phi^*(\varphi_i),$$

with

$$\phi^*(\cdot) = \frac{1}{M} \sum_{j=1}^{M} \left[ k(\varphi_j, \cdot)\, \nabla_{\varphi_j} \log p(\varphi_j|D) + \nabla_{\varphi_j} k(\varphi_j, \cdot) \right],$$

where $k(\cdot, \cdot)$ is an RBF kernel. The first term encourages ascent along the posterior gradient landscape, while the second, “repulsive” term $\nabla_{\varphi_j} k(\varphi_j, \cdot)$ pushes particles apart. The repulsion directly mitigates particle collapse and enables exploration of multiple posterior modes, which empirical ablations show to be essential for transfer performance (Lee et al., 13 Feb 2024).
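
This update vectorizes cleanly over a stack of flattened prompt particles. Below is a minimal sketch of one SVGD step with the median-heuristic RBF bandwidth; the damping used in BMTPT's variant (see Section 4) is omitted for brevity:

```python
import math
import torch

def svgd_step(particles, grad_log_p, eta=1e-3):
    """One SVGD update. particles: (M, D) flattened prompt vectors;
    grad_log_p: (M, D) gradients of log p(theta|D) at each particle."""
    M = particles.shape[0]
    sq = torch.cdist(particles, particles) ** 2   # pairwise squared distances
    h = sq.median() / math.log(M + 1) + 1e-8      # median-heuristic bandwidth
    K = torch.exp(-sq / h)                        # symmetric RBF kernel matrix
    drive = K @ grad_log_p                        # kernel-weighted posterior ascent
    # Repulsive term: sum_j grad_{phi_j} k(phi_j, phi_i), pushing particles apart.
    repulse = (2.0 / h) * (K.sum(dim=1, keepdim=True) * particles - K @ particles)
    return particles + (eta / M) * (drive + repulse)
```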

2.2 SGHMC with Representation-Space Repulsion (Vision–Language context)

ReBaPL for vision–language prompt learning (Bendou et al., 21 Nov 2025) employs $K$ parallel MCMC chains $\{\theta_k\}_{k=1}^K$ updated via stochastic gradient Hamiltonian Monte Carlo (SGHMC), with a cyclical step size for alternating exploration and exploitation. A novel repulsive force, derived from a potential function over probability metrics in the model’s representation space, is introduced:

$$U_{\mathrm{rep}}(\theta_1, \ldots, \theta_K) = \sum_{k=1}^K U(\theta_k) + \xi \sum_{k=1}^K \sum_{\ell \ne k} V_{\mathrm{rep}}(\theta_k, \theta_\ell),$$

where $U(\theta) = -\log p(D|\theta) - \log p(\theta)$ and the repulsive potential $V_{\mathrm{rep}}$ is constructed over distances between the distributions $\mathcal{P}(\mathcal{U})$ induced by prompts, measured via Maximum Mean Discrepancy (MMD) or the 2-Wasserstein distance. Thus, repulsion operates on induced representations, directly encouraging functional diversity.
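
The sketch below illustrates one way such a representation-space repulsion can be computed with MMD. Here `feature_fn` (returning a batch of prompt-conditioned representations on a probe batch) and the exponential form of $V_{\mathrm{rep}}$ are illustrative assumptions, not the paper's exact construction:

```python
import torch

def mmd2(f_a, f_b, bandwidth=1.0):
    """Squared MMD with an RBF kernel between two feature batches (n, d)."""
    def gram(u, v):
        return torch.exp(-torch.cdist(u, v) ** 2 / (2 * bandwidth ** 2))
    return gram(f_a, f_a).mean() + gram(f_b, f_b).mean() - 2 * gram(f_a, f_b).mean()

def repulsion_term(feature_fn, prompts, probe_x, xi=0.1, tau=1.0):
    """xi * sum_{k != l} V_rep(theta_k, theta_l), with V_rep largest when two
    prompts induce similar representation distributions (exponential decay in
    MMD is an illustrative choice)."""
    feats = [feature_fn(p, probe_x) for p in prompts]  # representations per chain
    V = prompts[0].new_zeros(())                       # scalar accumulator
    for k in range(len(prompts)):
        for l in range(len(prompts)):
            if l != k:
                V = V + torch.exp(-mmd2(feats[k], feats[l]) / tau)
    return xi * V  # add to sum_k U(theta_k); its gradient repels the chains
```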

3. Aggregation and Adaptation to Target Tasks

After sampling diverse prompt parameters, ReBaPL aggregates posterior samples for adaptation:

  • In BMTPT, prompt particles are aggregated via averaging and a Gaussian mixture prior, initializing the target prompt as $\theta_0 \leftarrow \bar{\varphi} = \frac{1}{M} \sum_{i=1}^M \varphi_i$, with L2 regularization maintaining proximity to this mean (Lee et al., 13 Feb 2024).
  • For vision–language models, the sampled ensemble $\{\theta_k\}$ is retained for inference, and predictions are averaged across the particles, reflecting epistemic model uncertainty and improving generalization (Bendou et al., 21 Nov 2025); a minimal sketch follows below.

Low-rank and full-rank prompt decompositions can be separately tuned for rapid adaptation to idiosyncratic target task features while conserving global structure.
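
A minimal sketch of the ensemble-averaged prediction mentioned above, reusing the hypothetical `frozen_model` interface from the earlier sketches:

```python
import torch

def ensemble_predict(frozen_model, prompt_particles, x):
    """Bayesian model average: p(y|x) ~ (1/K) * sum_k p(y|x, theta_k)."""
    probs = [torch.softmax(frozen_model(p, x), dim=-1) for p in prompt_particles]
    return torch.stack(probs).mean(dim=0)
```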

4. Implementation Details and Algorithmic Structure

ReBaPL instantiates as a modular, plug-and-play optimizer for prompt learning:

  • With SVGD (BMTPT): particle count $M$ between 5 and 10, RBF kernel bandwidth set by the median heuristic, learning rate $\eta \sim 10^{-3}$, a damped SVGD variant (self-gradient scaling factor $\lambda \approx 0.37$), and up to $10^5$ SVGD updates.
  • With SGHMC (ReBaPL for vision–language): the number of chains $K$, number of cycles $C$, and repulsion strength $\xi$ are tuned; a cyclical (cosine) step-size schedule alternates exploration and exploitation phases, and the representation-space distance is computed via either MMD or Wasserstein. Existing MLE prompt learners (e.g., CoOp, MaPLe, MMRL) serve as gradient oracles, requiring only optimizer replacement (Bendou et al., 21 Nov 2025).

Algorithm steps include AdamW burn-in, RC-SGHMC loop with representation-based repulsion, cosine scheduling, and cycle-phase-specific noise injection. Predictions for downstream tasks are averaged over the prompt ensemble.
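
For concreteness, here is a sketch of the cosine cyclical schedule commonly used in cyclical SGMCMC samplers; the exact constants and phase boundaries used by ReBaPL are not specified here and may differ:

```python
import math

def cyclical_step_size(t, total_steps, num_cycles, alpha0=1e-3):
    """Cosine cyclical schedule: each cycle opens with large steps (exploration)
    and decays toward zero (exploitation / sample collection)."""
    cycle_len = math.ceil(total_steps / num_cycles)
    pos = (t % cycle_len) / cycle_len  # fractional position within current cycle
    return 0.5 * alpha0 * (math.cos(math.pi * pos) + 1.0)
```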

5. Empirical Evaluation

Recent studies provide extensive empirical evidence for ReBaPL’s effectiveness:

| Metric/Setting | BMTPT (Lee et al., 13 Feb 2024) | ReBaPL (Bendou et al., 21 Nov 2025) |
|---|---|---|
| Param. efficiency | 0.035% of parameters tuned | Matches base method’s param. config |
| GLUE avg. (T5-base) | 88.7 | N/A |
| SuperGLUE avg. (T5-base) | 74.6 | N/A |
| Few-shot (4–32 shot, NLP) | +3–8 pt over MPT | N/A |
| Base-to-novel HM (16-shot VL) | N/A | +0.8–3.2 pp over base |
| Domain gen. (ImageNetV2/S/A/R) | N/A | +0.3–0.9 pp over base |

Key findings include:

  • SVGD repulsion is essential: removing the $\nabla k$ term (or using a single particle) causes significant GLUE/SuperGLUE degradation ($\sim$1–2 points) (Lee et al., 13 Feb 2024).
  • Ensemble diversity matters more than particle count: raising $M$ without repulsion hurts transfer.
  • Representation-based repulsion offers generalization gains across OOD, cross-dataset, and few-shot benchmarks, with both MMD and Wasserstein metrics delivering comparable gains (Bendou et al., 21 Nov 2025).

6. Theoretical and Practical Significance

ReBaPL’s core innovation is the explicit maintenance of diversity among prompt samples in the Bayesian posterior, operationalized via repulsive forces. This avoids the mode-seeking collapse typical of mean-field variational methods and ensures the posterior support covers multiple plausible task solutions. Empirically, this leads to improved sample efficiency, transfer, and OOD performance in both language and vision–language prompt tuning. The approach is agnostic to the parameterization and architecture of the base prompt method, supporting broad adoption via optimizer substitution (Lee et al., 13 Feb 2024, Bendou et al., 21 Nov 2025).

A plausible implication is that such repulsive Bayesian strategies will be instrumental as model and prompt spaces grow in scale and multimodality, particularly for few-shot, multitask, and adaptive inference problems.

ReBaPL generalizes prompt tuning by merging ideas from SVGD-based Bayesian deep learning (Lee et al., 13 Feb 2024) and MCMC-based uncertainty quantification (Bendou et al., 21 Nov 2025), augmenting them with notions of functional diversity tailored to prompt-based control in large models. The key distinction from prior Bayesian prompt learning is the repulsive interaction, either via kernel gradients in parameter space (NLP) or representation-space distances (vision–language), setting ReBaPL apart from both unimodal variational approximations and unregularized ensembling.

In conclusion, Repulsive Bayesian Prompt Learning variants provide a principled and practically effective framework for enhancing prompt-based model adaptation and generalization through robust, multimodal posterior inference equipped with diversity-preserving repulsion.
