- The paper introduces a VI-NF-based transport RJMCMC algorithm that improves proposal design by eliminating the need for pilot posterior sampling.
- The methodology leverages invertible normalizing flows to learn per-model transport maps, which simplify trans-dimensional moves and improve acceptance rates.
- Empirical results demonstrate faster mixing, low-variance marginal likelihood estimates, and robust model selection across complex Bayesian inference benchmarks.
Transport Reversible Jump MCMC with Proposals Generated by Variational Inference with Normalizing Flows
Introduction and Context
The paper introduces a new methodology for trans-dimensional Bayesian inference by integrating variational inference with normalizing flows (VI-NFs) into the design of reversible jump Markov chain Monte Carlo (RJMCMC) proposals. The fundamental problem addressed is efficient sampling and model comparison in Bayesian model selection settings where the posterior support is a union of spaces of varying dimensions. Traditional RJMCMC approaches face significant challenges in exploring such spaces, often impaired by measure-theoretic issues, inefficient proposal designs, and severe mixing limitations when the target posterior exhibits complex geometry or multimodality.
Recent advances in transport-based proposals—most notably the transport reversible jump (TRJ) framework—transform the sampling problem using deterministic, invertible transport maps, often learned via normalizing flows. However, existing methods typically rely on forward Kullback-Leibler (KL) objectives requiring target posterior samples, thus mandating expensive MCMC pre-runs with their own bias and variance limitations.
This work proposes a substantial modification: learning the transport maps with VI-NFs by minimizing the reverse KL divergence with respect to simple base distributions, thereby removing the need for pilot MCMC sampling from the target and yielding both computational and practical advantages.
Methodological Framework
Trans-dimensional RJMCMC and Transport Maps
The RJMCMC algorithm constructs moves between subspaces of different dimensions using diffeomorphic transformations plus auxiliary variables for dimension matching. A significant theoretical advance is that, with exact transport maps, the acceptance probability for trans-dimensional moves depends only on the prior and proposal probabilities over models, simplifying to:
$$a(x, x') = \min\left\{1,\ \frac{\pi(k')\, q(k \mid k')}{\pi(k)\, q(k' \mid k)}\right\}$$
where $\pi(\cdot)$ is the prior over models and $q(\cdot \mid \cdot)$ the model-jump proposal. When these satisfy detailed balance, $\pi(k)\,q(k' \mid k) = \pi(k')\,q(k \mid k')$, the ratio equals one and trans-dimensional moves are accepted automatically.
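As a minimal sketch (not the authors' code) of how this simplified accept step might look, with `model_prior` and `jump_prob` as hypothetical stand-ins for $\pi(\cdot)$ and $q(\cdot \mid \cdot)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def accept_jump(k, k_new, model_prior, jump_prob):
    """Accept/reject a between-model move under exact transport maps.

    With perfect transports the within-model terms cancel, and the
    acceptance probability reduces to the model-level ratio
    min{1, pi(k') q(k|k') / (pi(k) q(k'|k))}.
    """
    ratio = (model_prior[k_new] * jump_prob[k_new][k]) / (
        model_prior[k] * jump_prob[k][k_new]
    )
    return rng.random() < min(1.0, ratio)

# Toy check: a uniform model prior with symmetric jump probabilities
# gives ratio 1, so every proposed trans-dimensional move is accepted.
prior = {1: 0.5, 2: 0.5}
jumps = {1: {2: 1.0}, 2: {1: 1.0}}
assert accept_jump(1, 2, prior, jumps)
```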
Variational Inference with Normalizing Flows
VI approximates the intractable posterior $p^*$ with an expressive variational family $q_\eta$, parameterized by a normalizing flow $f_\eta$, and minimizes the reverse KL divergence:

$$\mathrm{KL}(q_\eta \,\|\, p^*) = \mathbb{E}_{q_\eta}\!\left[\log \frac{q_\eta(\theta)}{p^*(\theta)}\right]$$
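A step worth making explicit: the reverse KL remains tractable when $p^*$ is available only up to a normalizing constant $Z$, i.e. $p^*(\theta) = \tilde{p}(\theta)/Z$, since

$$\mathrm{KL}(q_\eta \,\|\, p^*) = \mathbb{E}_{q_\eta}\!\left[\log q_\eta(\theta) - \log \tilde{p}(\theta)\right] + \log Z,$$

so minimizing the expectation (equivalently, maximizing the ELBO) requires only samples from $q_\eta$ and pointwise evaluations of $\tilde{p}$, never samples from the target. This is precisely what removes the pilot-MCMC requirement of forward-KL training.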
Normalizing flows map samples from a base distribution through a sequence of invertible transformations with easily computable Jacobians, enabling scalable and tractable density modeling. The architecture incorporates RealNVP, masked autoregressive flows, and neural spline flows, with auxiliary mechanisms (e.g., softplus transforms, heavy-tailed base families) to accommodate parameter constraints and target heavy tails.
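To make the reverse-KL training loop concrete, here is a minimal self-contained sketch (not the paper's implementation) of fitting a RealNVP-style affine coupling flow in PyTorch; the toy target `log_p_tilde`, the Gaussian base, and all layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 2  # illustrative parameter dimension

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer: transforms half the coordinates
    conditioned on the other half; the Jacobian is triangular, so its
    log-determinant is just the sum of the predicted log-scales."""
    def __init__(self, dim, flip):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, 64), nn.Tanh(), nn.Linear(64, 2 * half)
        )

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=-1)
        if self.flip:
            z1, z2 = z2, z1
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)            # keep scales numerically stable
        y2 = z2 * torch.exp(s) + t
        out = torch.cat([y2, z1] if self.flip else [z1, y2], dim=-1)
        return out, s.sum(-1)        # output and log|det Jacobian|

def log_p_tilde(theta):
    """Unnormalized toy target (banana-shaped); stands in for the
    unnormalized conditional posterior of one model."""
    x, y = theta[:, 0], theta[:, 1]
    return -0.5 * (x**2 + (y - x**2)**2 / 0.5)

layers = nn.ModuleList([AffineCoupling(DIM, flip=i % 2 == 1) for i in range(4)])
base = torch.distributions.Normal(torch.zeros(DIM), torch.ones(DIM))
opt = torch.optim.Adam(layers.parameters(), lr=1e-3)

for step in range(2000):
    z = base.sample((256,))
    log_q = base.log_prob(z).sum(-1)  # log density of the base draw
    theta = z
    for layer in layers:
        theta, log_det = layer(theta)
        log_q = log_q - log_det       # change of variables for q_eta
    # Reverse KL (up to log Z): E_q[log q(theta) - log p_tilde(theta)]
    loss = (log_q - log_p_tilde(theta)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Note that the loop only ever samples from the base distribution and evaluates the unnormalized target, mirroring the pilot-run-free property emphasized above; a Student's t base (also mentioned in the paper) would slot in by swapping the `base` object.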
Transport RJMCMC with VI-NF Proposals
The proposed framework, RJMCMC-VINF, consists of the following essential steps:
- Learning per-model normalizing flows: For each conditional posterior, train a specific transport map with VI-NFs by minimizing reverse KL divergence, using only samples from a base (e.g., Student’s t or Gaussian) distribution.
- Between-model proposals: Given a state in model k, map it into the latent space via model k's transport; pad the latent vector with auxiliary base draws (when the target model's latent dimension is larger) or drop coordinates (when it is smaller); then apply the inverse of model k′'s transport map to obtain a valid proposal in model k′ (a schematic sketch follows this list).
- Within-model proposals: Reuse the same flow architectures for moves inside a model, in the spirit of NeuTra MCMC, with RealNVP favored for computational efficiency.
- Conditional flows for amortized learning: Exploit conditional architectures to parameterize transport maps as functions of the model index k, reducing the number of flows required and enabling efficient amortized inference across the model space.
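A schematic of the between-model move, under the assumption that each model k has a learned flow with forward map `T[k]` (parameters to latent space) and inverse `T_inv[k]`, with latent dimensions `d[k]`; these names are illustrative, not the paper's API:

```python
import numpy as np

rng = np.random.default_rng(1)

def between_model_proposal(x, k, k_new, T, T_inv, d, base_sample):
    """Propose a state in model k_new from a state x in model k.

    The current state is mapped to the latent space of model k; the
    latent vector is padded with auxiliary base draws (dimension up)
    or truncated (dimension down), then mapped through the inverse
    transport of model k_new.
    """
    z = T[k](x)                            # latent representation under model k
    if d[k_new] > d[k]:
        u = base_sample(d[k_new] - d[k])   # auxiliary dims from the base
        z_new = np.concatenate([z, u])
    else:
        z_new = z[: d[k_new]]              # drop trailing latent coordinates
    return T_inv[k_new](z_new)

# Toy usage with identity "flows" between a 2-D and a 3-D model.
T = {1: lambda x: x, 2: lambda x: x}
T_inv = {1: lambda z: z, 2: lambda z: z}
d = {1: 2, 2: 3}
x_new = between_model_proposal(
    np.zeros(2), 1, 2, T, T_inv, d,
    base_sample=lambda n: rng.standard_normal(n),
)
print(x_new.shape)  # (3,)
```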
The framework also yields accurate estimates of the marginal likelihood via importance sampling with the flow densities as proposals, a capability crucial for principled Bayes factor and model comparison calculations (sketched below).
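As a hedged illustration of that estimator, using the trained flow as the importance proposal (`sample_q`, returning draws together with their log-densities, is an assumed interface):

```python
import numpy as np

def log_marginal_likelihood(log_p_tilde, sample_q, n=10_000):
    """Importance-sampling estimate of log Z using the flow density q
    as proposal: Z ~= mean_i p_tilde(theta_i) / q(theta_i), theta_i ~ q.
    Computed in log space with log-sum-exp for numerical stability."""
    theta, log_q = sample_q(n)           # draws and their log q(theta)
    log_w = log_p_tilde(theta) - log_q   # log importance weights
    m = log_w.max()
    return m + np.log(np.exp(log_w - m).mean())

# Sanity check: unnormalized N(0,1) target, exact log Z = 0.5*log(2*pi).
rng = np.random.default_rng(2)
def sample_q(n):
    theta = rng.standard_normal(n)
    return theta, -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

print(log_marginal_likelihood(lambda t: -0.5 * t**2, sample_q))  # ~0.9189
```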
Numerical Experiments
Illustrative Example: Synthetic Trans-dimensional Target
Experiments on nonlinear pushforward models (Sinh-Arcsinh transformations) demonstrate that VI-NF-based transport proposals closely match the exact oracle transport, outperforming affine and spline-based alternatives. The resulting RJMCMC chains display faster mixing and more robust convergence in marginal model probabilities.
Bayesian Factor Analysis
On a real-world six-dimensional currency exchange rate dataset, the method is benchmarked on model determination between two- and three-factor Bayesian factor analysis models. The VI-NF-based proposals achieve rapid convergence and low-variance model posterior estimates, matching or exceeding the performance of both classical independence samplers and TRJ variants that use pre-trained target-sampled flows. Qualitative analysis of proposal distributions confirms improved geometric fidelity to the true targets.
Robust Regression with Variable Selection
Variable selection with block-structured covariates and heavy-tailed, multimodal error models serves as a rigorous test for trans-dimensional proposal quality. Across multiple data sizes and ground-truth settings, VI-NFs consistently yield model probability estimates with minimal variance and negligible bias against SMC-based ground truths, outperforming both affine and ROMA-NF-based TRJ methods. Conditional normalizing flows deliver equivalent accuracy with increased computational efficiency.
Theoretical and Practical Implications
The adoption of the reverse KL objective in transport map learning eliminates critical dependencies on representative pilot posterior samples—a common bottleneck in high-dimensional, multimodal, or nonstandard geometry settings where pilot MCMC runs may be unreliable or infeasible. The architectural preference for RealNVP flows ensures scalability and computational tractability.
The ability to produce highly accurate marginal likelihood estimates via importance sampling on flow densities is valuable for both Markov chain adaptation and model selection, offering a path to fully automated and amortized RJMCMC.
Moreover, the conditional flow construction paves the way for amortized inferential schemes where a single model-indexed flow supports efficient inference across a combinatorially large model space, crucial for high-dimensional Bayesian model determination tasks.
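One way such a model-indexed flow could be realized (an illustrative sketch under stated assumptions, not the paper's architecture) is to condition each coupling layer's scale/shift network on an embedding of the model index k; in practice the flow would act on a latent space of the maximal model dimension, with k selecting the conditioning:

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """Affine coupling whose scale/shift nets also see an embedding of
    the model index k, so one flow serves the whole model space."""
    def __init__(self, dim, n_models, emb_dim=8):
        super().__init__()
        half = dim // 2
        self.emb = nn.Embedding(n_models, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(half + emb_dim, 64), nn.Tanh(), nn.Linear(64, 2 * half)
        )

    def forward(self, z, k):
        z1, z2 = z.chunk(2, dim=-1)
        h = torch.cat([z1, self.emb(k)], dim=-1)  # condition on model index
        s, t = self.net(h).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([z1, z2 * torch.exp(s) + t], dim=-1), s.sum(-1)

# One layer shared across, say, 16 models; k selects the conditioning.
layer = ConditionalCoupling(dim=4, n_models=16)
z = torch.randn(32, 4)
k = torch.randint(0, 16, (32,))
theta, log_det = layer(z, k)
```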
Future Directions
Several developments are suggested by the results of this work:
- General amortization across broader model classes: Further scaling of conditional flows to massive model spaces in variable selection, structural learning, and large graphical models.
- Extensions to non-Bayesian and dynamic settings: Adapting the methodology for nonparametric Bayesian models, online Bayesian updating, and state space models with time-varying structures.
- Robustness for pathological targets: Investigations into flow architectures and base measures for distributions with severe tail dependence, singularities, or high multimodality.
- Theoretical convergence analysis: Formal study of optimality, ergodicity, and asymptotic efficiency in VI-NF-based transport MCMC samplers.
Conclusion
The integration of variational inference with normalizing flows into the TRJ/RJMCMC proposal paradigm marks a technically significant step toward practical, efficient, and fully automated trans-dimensional Bayesian inference. This approach circumvents the limitations associated with pilot MCMC sampling in flow training, achieves high-quality transport maps for both within- and between-model proposals, and supports fast, stable model probability and marginal likelihood estimation. The empirical results confirm consistent performance advantages across a spectrum of canonical Bayesian inference benchmarks. These developments directly inform the construction of scalable, robust, and amortized MCMC frameworks for complex model spaces in contemporary statistical and machine learning applications.
This summary is based on "Transport Reversible Jump Markov Chain Monte Carlo with proposals generated by Variational Inference with Normalizing Flows" (arXiv:2512.12742).