
FVAE-LoRA: Factorized VAE-based LoRA

Updated 29 October 2025
  • The paper introduces FVAE-LoRA, which integrates a VAE into LoRA to enable dynamic low-rank updates driven by task-salient features.
  • It leverages a two-latent VAE framework to explicitly separate task-relevant signals from residual noise, improving robustness under distribution shifts.
  • Empirical evaluations across text, audio, and image modalities show FVAE-LoRA outperforms standard methods in worst-group accuracy while remaining computationally efficient.

FVAE-LoRA is a parameter-efficient fine-tuning paradigm that integrates a variational autoencoder (VAE) into the low-rank adaptation (LoRA) mechanism. The approach replaces the standard static low-rank matrices with dynamic encodings that explicitly factorize the latent space into task-salient and residual components. This enables the adaptation process to emphasize causal features and to mitigate spurious correlations, thereby targeting improved downstream performance and robustness under distribution shifts.

1. Theoretical Motivation and Context

FVAE-LoRA extends the core idea of LoRA—injecting trainable low-rank matrices (A and B) into frozen weight matrices (W) via

$$\mathbf{W}_{\text{adapted}} = \mathbf{W} + \mathbf{B}\mathbf{A}$$

—by replacing the static matrix A with a data-dependent latent representation. Standard LoRA lacks explicit mechanisms to disambiguate task-relevant features from residual noise. FVAE-LoRA addresses this limitation by embedding a VAE within the adaptation loop, encouraging the separation of task-salient information from other variabilities.
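For reference, a minimal sketch of this standard LoRA update in PyTorch is shown below; the class and parameter names are illustrative and not taken from the paper or any particular library.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Standard LoRA: a frozen weight W plus a static low-rank update B A."""
    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained weight W.
        self.weight = nn.Parameter(torch.empty(out_features, in_features), requires_grad=False)
        nn.init.kaiming_uniform_(self.weight)
        # A is initialized with small random values, B with zeros, so BA starts as a no-op.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + scaling * B (A x): the low-rank term is the same for every input.
        return x @ self.weight.T + self.scaling * (x @ self.A.T @ self.B.T)
```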

FVAE-LoRA builds on methodologies found in related works such as IVON-LoRA (Cong et al., 17 Jun 2025) and prior approaches that combine sparse autoencoders with LoRA finetuning (Chen et al., 31 Jan 2025). By leveraging a novel Evidence Lower Bound (ELBO) formulation with a cross-prior regularizer, FVAE-LoRA promotes disentanglement within the learned latent space and improves robustness under distribution shift.

2. Architecture and Mathematical Foundations

FVAE-LoRA replaces the global, static low-rank matrix in standard LoRA with a dynamic matrix generated from a VAE module that factors the information into two latent spaces: one for task-salient features ($z_1$) and one for residual components ($z_2$). The core components are as follows:

  • Two separate encoders $q_{\phi_1}(z_1 \mid x)$ and $q_{\phi_2}(z_2 \mid x)$ that process the input $x$.
  • Distinct latent priors:
    • $p_1(z_1) = \mathcal{N}(0, I)$ for the task-relevant representation.
    • $p_2(z_2) = \mathcal{N}(1.5, I)$ for the residual space, enforcing separation via differing means.
  • A decoder that reconstructs the input from the concatenated latent representations $(z_1, z_2)$.

The adaptation mechanism uses only $z_1$: a trainable mapping (matrix $\mathbf{B}$) is applied to $z_1$ and the resulting update is added to the transformed activation:

$$\widehat{h}(x) = \mathbf{W}x + \mathbf{B}f(z_1).$$

This selective usage of $z_1$ ensures that only the task-salient features drive model adaptation.
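A minimal sketch of how such a layer could be wired is given below, assuming PyTorch; the module names, latent dimension, and the choice of $f$ as the identity map are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class FVAELoRALinear(nn.Module):
    """Sketch: LoRA's static A is replaced by a per-input latent z1 from a two-latent VAE."""
    def __init__(self, in_features: int, out_features: int, latent_dim: int = 8):
        super().__init__()
        # Frozen pretrained weight W.
        self.weight = nn.Parameter(torch.empty(out_features, in_features), requires_grad=False)
        nn.init.kaiming_uniform_(self.weight)
        # Two encoders: q_phi1(z1 | x) for task-salient features, q_phi2(z2 | x) for residuals.
        # Each outputs the mean and log-variance of a diagonal Gaussian posterior.
        self.enc1 = nn.Linear(in_features, 2 * latent_dim)
        self.enc2 = nn.Linear(in_features, 2 * latent_dim)
        # Decoder reconstructs x from the concatenated latents (z1, z2).
        self.dec = nn.Linear(2 * latent_dim, in_features)
        # Trainable B maps f(z1) into the output space; only z1 drives the adaptation.
        self.B = nn.Parameter(torch.zeros(out_features, latent_dim))

    @staticmethod
    def reparameterize(stats: torch.Tensor):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

    def forward(self, x: torch.Tensor):
        z1, mu1, logvar1 = self.reparameterize(self.enc1(x))
        z2, _, _ = self.reparameterize(self.enc2(x))
        x_rec = self.dec(torch.cat([z1, z2], dim=-1))  # used only in the FVAE loss
        # Adapted activation: h(x) = W x + B f(z1), with f taken as the identity here.
        h = x @ self.weight.T + z1 @ self.B.T
        return h, (x_rec, mu1, logvar1, z1, z2)
```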

3. ELBO Formulation with Latent Factorization

The learning objective for FVAE-LoRA is an extension of the classical VAE ELBO. For a two-latent VAE, the standard objective is expressed as:

$$\mathcal{L}^{\mathrm{VAE2LAT}}(x) = \mathbb{E}_{z_1, z_2}\big[\log p_\theta(x \mid z_1, z_2)\big] - D_{\mathrm{KL}}\big(q_{\phi_1}(z_1 \mid x) \,\|\, p_1(z_1)\big) - D_{\mathrm{KL}}\big(q_{\phi_2}(z_2 \mid x) \,\|\, p_2(z_2)\big).$$

FVAE-LoRA introduces a cross-prior regularizer, $\Gamma$, to explicitly encourage separation between the latent spaces. The regularizer is defined as:

$$\Gamma = \mathbb{E}_{z_2}\big[\log p_2(z_2) - \log p_1(z_2)\big] + \Big(\mathbb{E}_{z_2}\big[\log p_1(z_2)\big] - \mathbb{E}_{z_1}\big[\log p_1(z_1)\big]\Big).$$

Incorporating weighting factors $\alpha$ (reconstruction), $\beta$ (KL divergence), and $\delta$ (repulsion), the final FVAE-LoRA objective becomes

$$\mathcal{L}^{\mathrm{FVAE}}_{\theta,\phi}(x) = \alpha\, \mathbb{E}_{z_1, z_2}\big[\log p_\theta(x \mid z_1, z_2)\big] - \beta\, D_{\mathrm{KL}}\big(q_{\phi_1}(z_1 \mid x) \,\|\, p_1(z_1)\big) + \delta\, \Gamma.$$

For downstream tasks, the FVAE loss is integrated into the total loss for each target layer $l$ and corresponding activation $x_l$, such that only $q_{\phi_1}(z_1 \mid x)$ contributes to the dynamic low-rank update at inference.
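The objective above can be sketched in code as follows, using the stated Gaussian priors $p_1 = \mathcal{N}(0, I)$ and $p_2 = \mathcal{N}(1.5, I)$; the single-sample Monte Carlo estimates and the squared-error reconstruction term are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def gaussian_log_prob(z: torch.Tensor, mean: float) -> torch.Tensor:
    # log N(z; mean, I), dropping the constant term shared by both priors.
    return -0.5 * ((z - mean) ** 2).sum(dim=-1)

def fvae_loss(x, x_rec, mu1, logvar1, z1, z2, alpha=1.0, beta=1.0, delta=1.0):
    # Reconstruction term E[log p(x | z1, z2)], approximated by a negative squared error.
    recon = -F.mse_loss(x_rec, x, reduction="none").sum(dim=-1)
    # Closed-form KL(q_phi1(z1 | x) || N(0, I)) for a diagonal Gaussian posterior.
    kl1 = -0.5 * (1 + logvar1 - mu1.pow(2) - logvar1.exp()).sum(dim=-1)
    # Cross-prior (repulsion) regularizer Gamma with p1 = N(0, I) and p2 = N(1.5, I),
    # estimated from single samples z1 ~ q_phi1 and z2 ~ q_phi2.
    log_p1_z2 = gaussian_log_prob(z2, 0.0)
    log_p2_z2 = gaussian_log_prob(z2, 1.5)
    log_p1_z1 = gaussian_log_prob(z1, 0.0)
    gamma = (log_p2_z2 - log_p1_z2) + (log_p1_z2 - log_p1_z1)
    # The objective alpha*recon - beta*KL + delta*Gamma is maximized, so return its negation.
    objective = alpha * recon - beta * kl1 + delta * gamma
    return -objective.mean()
```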

4. Learning Dynamics and Functional Roles

The design of FVAE-LoRA enforces a clear functional differentiation between the two latent spaces:

  • $z_1$: Constrained via the KL divergence term and the downstream task loss, $z_1$ is forced to encode the features that are causally and semantically relevant to the task at hand. This latent variable directly drives the low-rank update applied during adaptation.
  • $z_2$: Dedicated to capturing the residual information necessary for accurate input reconstruction, $z_2$ absorbs non-task-relevant variability. The repulsive regularizer encourages the encoding produced by $q_{\phi_2}(z_2 \mid x)$ to remain distinct from that of $q_{\phi_1}(z_1 \mid x)$.

Such a factorized representation not only guides adaptation towards task-salient signals but also reduces the risk of incorporating spurious or misleading correlations, which can harm performance, particularly under distribution shifts.

5. Empirical Performance and Robustness

Empirical evaluations of FVAE-LoRA span text, audio, and image tasks and demonstrate consistent improvements over standard LoRA. Key experimental findings include:

  • Improved worst-group and overall accuracy on benchmarks where spurious correlations pose a challenge.
  • Robustness to distribution shifts, as evidenced by higher worst-group accuracy and lower disparities between subgroups.
  • In natural language tasks, performance on commonsense reasoning (e.g., on Llama-3-8B models) and GLUE benchmarks surpasses both standard LoRA and full fine-tuning.
  • On image classification tasks, FVAE-LoRA slightly exceeds full fine-tuning in average accuracy on multiple datasets while offering the computational benefits inherent to parameter-efficient tuning.

A concise comparison of key aspects is provided in the table below:

Aspect | Standard LoRA | FVAE-LoRA
Adaptation Signal | Global, static low-rank update | Dynamic, data-dependent update via VAE ($z_1$)
Latent Factorization | Not enforced | Explicit, via cross-prior regularizer
Robustness | Sensitive to spurious cues | Improved under distribution shifts

6. Conclusion

FVAE-LoRA represents a refined integration of variational inference into low-rank adaptation, enabling explicit control over the semantic content of the learned low-rank subspace. By factorizing latent representations into task-salient and residual components and employing an ELBO formulation with a repulsive regularizer, the method directs adaptation toward causal features. Empirical results across multiple modalities confirm that FVAE-LoRA not only enhances task performance and robustness but also preserves interpretability with minimal additional computational overhead. This principled architecture paves the way for further innovations in parameter-efficient tuning, particularly in settings where robustness and feature disentanglement are paramount.
