- The paper introduces a VI-NF-based transport RJMCMC algorithm that improves proposal design by eliminating the need for pilot posterior sampling.
- The methodology leverages invertible normalizing flows to learn per-model transport maps, which simplify trans-dimensional moves and improve acceptance rates.
- Empirical results demonstrate faster mixing, low-variance marginal likelihood estimates, and robust model selection across complex Bayesian inference benchmarks.
Transport Reversible Jump MCMC with Proposals Generated by Variational Inference with Normalizing Flows
Introduction and Context
The paper introduces a new methodology for trans-dimensional Bayesian inference by integrating variational inference with normalizing flows (VI-NFs) into the design of reversible jump Markov chain Monte Carlo (RJMCMC) proposals. The fundamental problem addressed is efficient sampling and model comparison in Bayesian model selection settings where the posterior support is a union of spaces of varying dimensions. Traditional RJMCMC approaches face significant challenges in exploring such spaces, often impaired by measure-theoretic issues, inefficient proposal designs, and severe mixing limitations when the target posterior exhibits complex geometry or multimodality.
Recent advances in transport-based proposals—most notably the transport reversible jump (TRJ) framework—transform the sampling problem using deterministic, invertible transport maps, often learned via normalizing flows. However, existing methods typically rely on forward Kullback-Leibler (KL) objectives requiring target posterior samples, thus mandating expensive MCMC pre-runs with their own bias and variance limitations.
This work proposes a substantial modification: learning the transport maps with VI-NFs by minimizing the reverse KL divergence with respect to simple base distributions, thereby removing the need for pilot MCMC sampling from the target and yielding both computational and practical advantages.
Methodological Framework
Trans-dimensional RJMCMC and Transport Maps
The RJMCMC algorithm constructs moves between subspaces of different dimensions using diffeomorphic transformations plus auxiliary variables for dimension matching. A significant theoretical advance is that, with exact transport maps, the acceptance probability for trans-dimensional moves depends only on the prior and proposal probabilities over models, simplifying to:
$$a(x, x') = \min\left\{1,\ \frac{\pi(k')\, q(k \mid k')}{\pi(k)\, q(k' \mid k)}\right\}$$
where $\pi(\cdot)$ is the prior over models and $q(\cdot \mid \cdot)$ the model-jump proposal. When these satisfy detailed balance, $\pi(k)\,q(k' \mid k) = \pi(k')\,q(k \mid k')$, the ratio equals one and trans-dimensional moves are accepted automatically.
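As a minimal sketch (not the authors' code) of how this simplified accept step might look, with `model_prior` and `jump_prob` as hypothetical stand-ins for $\pi(\cdot)$ and $q(\cdot \mid \cdot)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def accept_jump(k, k_new, model_prior, jump_prob):
    """Accept/reject a between-model move under exact transport maps.

    With perfect transports the within-model terms cancel, and the
    acceptance probability reduces to the model-level ratio
    min{1, pi(k') q(k|k') / (pi(k) q(k'|k))}.
    """
    ratio = (model_prior[k_new] * jump_prob[k_new][k]) / (
        model_prior[k] * jump_prob[k][k_new]
    )
    return rng.random() < min(1.0, ratio)

# Toy check: a uniform model prior with symmetric jump probabilities
# gives ratio 1, so every proposed trans-dimensional move is accepted.
prior = {1: 0.5, 2: 0.5}
jumps = {1: {2: 1.0}, 2: {1: 1.0}}
assert accept_jump(1, 2, prior, jumps)
```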
Variational Inference with Normalizing Flows
VI approximates the intractable posterior $p^*$ with an expressive variational family $q_\eta$, parameterized by a normalizing flow $f_\eta$, and minimizes the reverse KL divergence:

$$\mathrm{KL}(q_\eta \,\|\, p^*) = \mathbb{E}_{q_\eta}\!\left[\log \frac{q_\eta(\theta)}{p^*(\theta)}\right]$$
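A step worth making explicit: the reverse KL remains tractable when $p^*$ is available only up to a normalizing constant $Z$, i.e. $p^*(\theta) = \tilde{p}(\theta)/Z$, since

$$\mathrm{KL}(q_\eta \,\|\, p^*) = \mathbb{E}_{q_\eta}\!\left[\log q_\eta(\theta) - \log \tilde{p}(\theta)\right] + \log Z,$$

so minimizing the expectation (equivalently, maximizing the ELBO) requires only samples from $q_\eta$ and pointwise evaluations of $\tilde{p}$, never samples from the target. This is precisely what removes the pilot-MCMC requirement of forward-KL training.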
Normalizing flows map samples from a base distribution through a sequence of invertible transformations with easily computable Jacobians, enabling scalable and tractable density modeling. The architecture incorporates RealNVP, masked autoregressive flows, and neural spline flows, with auxiliary mechanisms (e.g., softplus transforms, heavy-tailed base families) to accommodate parameter constraints and target heavy tails.
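To make the reverse-KL training loop concrete, here is a minimal self-contained sketch (not the paper's implementation) of fitting a RealNVP-style affine coupling flow in PyTorch; the toy target `log_p_tilde`, the Gaussian base, and all layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 2  # illustrative parameter dimension

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer: transforms half the coordinates
    conditioned on the other half; the Jacobian is triangular, so its
    log-determinant is just the sum of the predicted log-scales."""
    def __init__(self, dim, flip):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, 64), nn.Tanh(), nn.Linear(64, 2 * half)
        )

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=-1)
        if self.flip:
            z1, z2 = z2, z1
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)            # keep scales numerically stable
        y2 = z2 * torch.exp(s) + t
        out = torch.cat([y2, z1] if self.flip else [z1, y2], dim=-1)
        return out, s.sum(-1)        # output and log|det Jacobian|

def log_p_tilde(theta):
    """Unnormalized toy target (banana-shaped); stands in for the
    unnormalized conditional posterior of one model."""
    x, y = theta[:, 0], theta[:, 1]
    return -0.5 * (x**2 + (y - x**2)**2 / 0.5)

layers = nn.ModuleList([AffineCoupling(DIM, flip=i % 2 == 1) for i in range(4)])
base = torch.distributions.Normal(torch.zeros(DIM), torch.ones(DIM))
opt = torch.optim.Adam(layers.parameters(), lr=1e-3)

for step in range(2000):
    z = base.sample((256,))
    log_q = base.log_prob(z).sum(-1)  # log density of the base draw
    theta = z
    for layer in layers:
        theta, log_det = layer(theta)
        log_q = log_q - log_det       # change of variables for q_eta
    # Reverse KL (up to log Z): E_q[log q(theta) - log p_tilde(theta)]
    loss = (log_q - log_p_tilde(theta)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Note that the loop only ever samples from the base distribution and evaluates the unnormalized target, mirroring the pilot-run-free property emphasized above; a Student's t base (also mentioned in the paper) would slot in by swapping the `base` object.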
Transport RJMCMC with VI-NF Proposals
The proposed framework, RJMCMC-VINF, consists of the following essential steps:
- Learning per-model normalizing flows: For each conditional posterior, train a specific transport map with VI-NFs by minimizing reverse KL divergence, using only samples from a base (e.g., Student’s t or Gaussian) distribution.
- Between-model proposals: Given a state in model k, map it into the latent space via model k's transport; pad the latent vector with auxiliary base draws (when the target model's latent dimension is larger) or drop coordinates (when it is smaller); then apply the inverse of model k′'s transport map to obtain a valid proposal in model k′ (a schematic sketch follows this list).
- Within-model proposals: Reuse the same flow architectures for moves inside a model, in the spirit of NeuTra MCMC, with RealNVP favored for computational efficiency.
- Conditional flows for amortized learning: Exploit conditional architectures to parameterize transport maps as functions of the model index k, reducing the number of flows required and enabling efficient amortized inference across the model space.
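A schematic of the between-model move, under the assumption that each model k has a learned flow with forward map `T[k]` (parameters to latent space) and inverse `T_inv[k]`, with latent dimensions `d[k]`; these names are illustrative, not the paper's API:

```python
import numpy as np

rng = np.random.default_rng(1)

def between_model_proposal(x, k, k_new, T, T_inv, d, base_sample):
    """Propose a state in model k_new from a state x in model k.

    The current state is mapped to the latent space of model k; the
    latent vector is padded with auxiliary base draws (dimension up)
    or truncated (dimension down), then mapped through the inverse
    transport of model k_new.
    """
    z = T[k](x)                            # latent representation under model k
    if d[k_new] > d[k]:
        u = base_sample(d[k_new] - d[k])   # auxiliary dims from the base
        z_new = np.concatenate([z, u])
    else:
        z_new = z[: d[k_new]]              # drop trailing latent coordinates
    return T_inv[k_new](z_new)

# Toy usage with identity "flows" between a 2-D and a 3-D model.
T = {1: lambda x: x, 2: lambda x: x}
T_inv = {1: lambda z: z, 2: lambda z: z}
d = {1: 2, 2: 3}
x_new = between_model_proposal(
    np.zeros(2), 1, 2, T, T_inv, d,
    base_sample=lambda n: rng.standard_normal(n),
)
print(x_new.shape)  # (3,)
```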
The framework also yields accurate estimates of the marginal likelihood via importance sampling with the flow densities as proposals, a capability crucial for principled Bayes factor and model comparison calculations (sketched below).
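As a hedged illustration of that estimator, using the trained flow as the importance proposal (`sample_q`, returning draws together with their log-densities, is an assumed interface):

```python
import numpy as np

def log_marginal_likelihood(log_p_tilde, sample_q, n=10_000):
    """Importance-sampling estimate of log Z using the flow density q
    as proposal: Z ~= mean_i p_tilde(theta_i) / q(theta_i), theta_i ~ q.
    Computed in log space with log-sum-exp for numerical stability."""
    theta, log_q = sample_q(n)           # draws and their log q(theta)
    log_w = log_p_tilde(theta) - log_q   # log importance weights
    m = log_w.max()
    return m + np.log(np.exp(log_w - m).mean())

# Sanity check: unnormalized N(0,1) target, exact log Z = 0.5*log(2*pi).
rng = np.random.default_rng(2)
def sample_q(n):
    theta = rng.standard_normal(n)
    return theta, -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

print(log_marginal_likelihood(lambda t: -0.5 * t**2, sample_q))  # ~0.9189
```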
Numerical Experiments
Illustrative Example: Synthetic Trans-dimensional Target
Experiments on nonlinear pushforward models (Sinh-Arcsinh transformations) demonstrate that VI-NF-based transport proposals closely match the exact oracle transport, outperforming affine and spline-based alternatives. The resulting RJMCMC chains display faster mixing and more robust convergence in marginal model probabilities.
Bayesian Factor Analysis
On a real-world six-dimensional currency exchange rate dataset, the method is benchmarked on model determination between two- and three-factor Bayesian factor analysis models. The VI-NF-based proposals achieve rapid convergence and low-variance model posterior estimates, matching or exceeding the performance of both classical independence samplers and TRJ variants that use pre-trained target-sampled flows. Qualitative analysis of proposal distributions confirms improved geometric fidelity to the true targets.
Robust Regression with Variable Selection
Variable selection with block-structured covariates and heavy-tailed, multimodal error models serves as a rigorous test for trans-dimensional proposal quality. Across multiple data sizes and ground-truth settings, VI-NFs consistently yield model probability estimates with minimal variance and negligible bias against SMC-based ground truths, outperforming both affine and ROMA-NF-based TRJ methods. Conditional normalizing flows deliver equivalent accuracy with increased computational efficiency.
Theoretical and Practical Implications
The adoption of the reverse KL objective in transport map learning eliminates critical dependencies on representative pilot posterior samples—a common bottleneck in high-dimensional, multimodal, or nonstandard geometry settings where pilot MCMC runs may be unreliable or infeasible. The architectural preference for RealNVP flows ensures scalability and computational tractability.
The ability to produce highly accurate marginal likelihood estimates via importance sampling on flow densities is valuable for both Markov chain adaptation and model selection, offering a path to fully automated and amortized RJMCMC.
Moreover, the conditional flow construction paves the way for amortized inferential schemes where a single model-indexed flow supports efficient inference across a combinatorially large model space, crucial for high-dimensional Bayesian model determination tasks.
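One way such a model-indexed flow could be realized (an illustrative sketch under stated assumptions, not the paper's architecture) is to condition each coupling layer's scale/shift network on an embedding of the model index k; in practice the flow would act on a latent space of the maximal model dimension, with k selecting the conditioning:

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """Affine coupling whose scale/shift nets also see an embedding of
    the model index k, so one flow serves the whole model space."""
    def __init__(self, dim, n_models, emb_dim=8):
        super().__init__()
        half = dim // 2
        self.emb = nn.Embedding(n_models, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(half + emb_dim, 64), nn.Tanh(), nn.Linear(64, 2 * half)
        )

    def forward(self, z, k):
        z1, z2 = z.chunk(2, dim=-1)
        h = torch.cat([z1, self.emb(k)], dim=-1)  # condition on model index
        s, t = self.net(h).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([z1, z2 * torch.exp(s) + t], dim=-1), s.sum(-1)

# One layer shared across, say, 16 models; k selects the conditioning.
layer = ConditionalCoupling(dim=4, n_models=16)
z = torch.randn(32, 4)
k = torch.randint(0, 16, (32,))
theta, log_det = layer(z, k)
```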
Future Directions
Several developments are suggested by the results of this work:
- General amortization across broader model classes: Further scaling of conditional flows to massive model spaces in variable selection, structural learning, and large graphical models.
- Extensions to non-Bayesian and dynamic settings: Adapting the methodology for nonparametric Bayesian models, online Bayesian updating, and state space models with time-varying structures.
- Robustness for pathological targets: Investigations into flow architectures and base measures for distributions with severe tail dependence, singularities, or high multimodality.
- Theoretical convergence analysis: Formal study of optimality, ergodicity, and asymptotic efficiency in VI-NF-based transport MCMC samplers.
Conclusion
The integration of variational inference with normalizing flows into the TRJ/RJMCMC proposal paradigm marks a technically significant step toward practical, efficient, and fully automated trans-dimensional Bayesian inference. This approach circumvents the limitations associated with pilot MCMC sampling in flow training, achieves high-quality transport maps for both within- and between-model proposals, and supports fast, stable model probability and marginal likelihood estimation. The empirical results confirm consistent performance advantages across a spectrum of canonical Bayesian inference benchmarks. These developments directly inform the construction of scalable, robust, and amortized MCMC frameworks for complex model spaces in contemporary statistical and machine learning applications.
This summary is based on "Transport Reversible Jump Markov Chain Monte Carlo with proposals generated by Variational Inference with Normalizing Flows" (arXiv:2512.12742).