Reversible-Jump MCMC
- RJMCMC is a method for sampling across models with differing dimensions by using dimension-matching transformations that preserve detailed balance and ergodicity.
- It facilitates Bayesian model selection and nonparametric inference in applications like mixture modeling, geophysical inversion, and change point detection.
- Advanced proposals—including transport-based moves, multiple-try strategies, and adaptive tuning—enhance performance in complex, high-dimensional, or multimodal target spaces.
Reversible-Jump Markov Chain Monte Carlo (RJMCMC)
Reversible-Jump Markov Chain Monte Carlo (RJMCMC) is a class of Markov chain Monte Carlo algorithms designed to sample from posterior distributions defined over a union of model spaces with differing dimensions or structures. RJMCMC extends classical MCMC by enabling transitions between models (or parameter spaces) of varying dimensionality while maintaining detailed balance and ergodicity. This capability is critical in Bayesian model selection, nonparametric inference, transdimensional inverse problems, and a range of applications where the model complexity (e.g., number of basis functions, mixture components, or covariates) is itself an object of inference.
1. Fundamental Principles and Mechanism
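The core innovation of RJMCMC is the introduction of dimension-matching transformations, i.e., bijections between parameter-auxiliary variable tuples for source and target models, which enable reversible trans-dimensional moves. Consider models indexed by $k$, parameter spaces $\Theta_k$ of dimension $d_k$, and posteriors $\pi(k, \theta_k \mid y)$. Between-model proposals proceed by:
- Proposing a new model index $k' \sim j(\cdot \mid k)$.
- Sampling an auxiliary variable $u \sim q_{k \to k'}(\cdot)$ of dimension $d_u$.
- Applying a diffeomorphic mapping $(\theta_{k'}, u') = T_{k \to k'}(\theta_k, u)$, ensuring dimensionality matches ($d_k + d_u = d_{k'} + d_{u'}$).
- Discarding $u'$, retaining only the proposed $\theta_{k'}$.
The acceptance probability is
$$\alpha = \min\left\{1,\ \frac{\pi(k', \theta_{k'} \mid y)\, j(k \mid k')\, q_{k' \to k}(u')}{\pi(k, \theta_k \mid y)\, j(k' \mid k)\, q_{k \to k'}(u)} \left| \det \frac{\partial(\theta_{k'}, u')}{\partial(\theta_k, u)} \right| \right\},$$
ensuring reversibility with respect to the joint posterior (Yin et al., 14 Dec 2025).
Within-model moves (when $k' = k$) reduce to classical Metropolis–Hastings updates.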
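The between-model and within-model moves described above can be illustrated on a toy target. The sketch below is a minimal example under stated assumptions, not the algorithm of any cited paper: the target has two models with assumed weights 0.3 and 0.7, each with a standard normal density in one or two dimensions; the names `WEIGHTS`, `AUX_SD`, `rj_step`, and `run_chain` are hypothetical. The birth move appends an auxiliary draw (identity map, unit Jacobian), and the death move is its inverse. Because a jump is attempted with probability 0.5 from either model, the model-proposal probabilities cancel in the ratio.

```python
import math
import random

WEIGHTS = {1: 0.3, 2: 0.7}  # assumed model probabilities of the toy target
AUX_SD = 1.5                # std of the auxiliary proposal q(u); deliberately not 1


def log_normal(x, sd=1.0):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * (x / sd) ** 2 - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_target(k, theta):
    """Joint log density: weight of model k times a standard normal in d_k dims."""
    return math.log(WEIGHTS[k]) + sum(log_normal(t) for t in theta)


def accept(log_ratio, rng):
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    return math.log(1.0 - rng.random()) < log_ratio


def rj_step(k, theta, rng):
    if rng.random() < 0.5:
        # Within-model move: random-walk Metropolis-Hastings.
        prop = [t + rng.gauss(0.0, 0.8) for t in theta]
        if accept(log_target(k, prop) - log_target(k, theta), rng):
            theta = prop
        return k, theta
    if k == 1:
        # Birth: draw u ~ q and map (theta, u) -> (theta, u); |Jacobian| = 1.
        u = rng.gauss(0.0, AUX_SD)
        prop = theta + [u]
        log_ratio = log_target(2, prop) - log_target(1, theta) - log_normal(u, AUX_SD)
        return (2, prop) if accept(log_ratio, rng) else (k, theta)
    # Death: inverse map; the dropped coordinate plays the role of u.
    u = theta[1]
    prop = theta[:1]
    log_ratio = log_target(1, prop) + log_normal(u, AUX_SD) - log_target(2, theta)
    return (1, prop) if accept(log_ratio, rng) else (k, theta)


def run_chain(n_iter=20000, seed=0):
    """Return the fraction of iterations spent in model 2."""
    rng = random.Random(seed)
    k, theta = 1, [0.0]
    visits_to_2 = 0
    for _ in range(n_iter):
        k, theta = rj_step(k, theta, rng)
        visits_to_2 += (k == 2)
    return visits_to_2 / n_iter


if __name__ == "__main__":
    print("estimated P(k = 2):", run_chain())
```

If the moves are balanced correctly, the fraction of time spent in model 2 should approach the assumed weight 0.7 as the chain length grows.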
2. Dimension-Matching and Move Design
Dimension-matching is achieved via the construction of invertible maps between the augmented source state $(\theta_k, u)$ and the augmented target state $(\theta_{k'}, u')$, which can be as simple as concatenation (for birth moves: appending new parameters sampled from a prior) or more general deterministic bijections (e.g., split/merge operations in mixture modeling or Gram–Schmidt orthogonalizations for latent matrix dimensions) (Dey et al., 2018, Karakuş et al., 2017, Luo, 2010).
Key move types include:
- Birth/death: Adding or removing a parameter with the identity mapping for newly introduced or discarded coordinates;
- Split/merge: Splitting or merging mixture components, typically necessitating complex mappings and non-trivial Jacobian determinants;
- Switches between structurally distinct models ("trans-space" RJMCMC): Moving between models with different functional forms or families by defining invertible parameter transformations that preserve certain summary statistics (e.g., moments) (Karakuş et al., 2017).
The Jacobian determinant accounts for volume change under these variable transformations and is central to ensuring proper weighting in the acceptance ratio.
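To make the role of the Jacobian concrete, the following sketch uses a deliberately simple split/merge pair and checks its determinant numerically. The mapping is an illustrative assumption (a mean-preserving split of one location parameter into two), not the mapping used in any of the cited works; `split`, `merge`, and `jacobian_det` are hypothetical helper names.

```python
def split(mu, u):
    """Split one mean into two, preserving their average: (mu, u) -> (mu1, mu2)."""
    return mu - u, mu + u


def merge(mu1, mu2):
    """Inverse of split: (mu1, mu2) -> (mu, u)."""
    return 0.5 * (mu1 + mu2), 0.5 * (mu2 - mu1)


def jacobian_det(f, x, y, h=1e-6):
    """Determinant of the Jacobian of f: R^2 -> R^2 at (x, y), by central differences."""
    fx_hi, fx_lo = f(x + h, y), f(x - h, y)
    fy_hi, fy_lo = f(x, y + h), f(x, y - h)
    a = (fx_hi[0] - fx_lo[0]) / (2 * h)  # d out0 / d x
    b = (fy_hi[0] - fy_lo[0]) / (2 * h)  # d out0 / d y
    c = (fx_hi[1] - fx_lo[1]) / (2 * h)  # d out1 / d x
    d = (fy_hi[1] - fy_lo[1]) / (2 * h)  # d out1 / d y
    return a * d - b * c


if __name__ == "__main__":
    print("split |J|:", abs(jacobian_det(split, 1.0, 0.3)))
    print("merge |J|:", abs(jacobian_det(merge, 0.7, 1.3)))
```

For this linear map the split determinant is 2 and the merge determinant is 1/2, so the factor that enters the acceptance ratio for a split is the reciprocal of the one used for the reverse merge.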
3. Advanced Proposal Strategies and Algorithmic Extensions
RJMCMC efficiency can be severely compromised by poor mixing in high-dimensional, correlated, or multimodal target spaces. To ameliorate these issues, several proposal strategies and algorithmic extensions have been introduced:
- Transport-based RJMCMC: Nonlinear normalizing flows $T_k$ are trained to map posterior distributions of each model to a reference distribution (e.g., standard normal). Model transitions operate in the reference space using additional dimension-matching transformations, then the inverse flow returns the proposal to the native parameterization. Under conditions of exact transport, all Jacobian and proposal density terms cancel except for the model probabilities, yielding theoretically optimal (model-only) acceptance (Davies et al., 2022, Yin et al., 14 Dec 2025).
- Multiple-Try and Interpolation Proposals: Drawing several candidate parameters for the target model and selecting among them according to locally adaptive weights (e.g., using quadratic approximations of the posterior), as in the GMTRJ algorithm, can enhance acceptance probability and mixing (Pandolfi et al., 2010). Alternatively, using kD-tree or covariance-adaptive proposals built from single-model MCMC samples approximates the high-density regions of each posterior and substantially increases inter-model move acceptance (Farr et al., 2011).
- Parallel Tempering and Simulated Annealing: Embedding RJMCMC moves within a simulated annealing or auxiliary-tempered framework (with deterministic or non-reversible temperature swaps) helps concentrate the chain on global optima and explore multimodal landscapes more efficiently (Tian et al., 2024, Dey et al., 2018, Andrieu et al., 2013).
- Non-standard Continuous-Time and PDMP Methods: Piecewise deterministic Markov processes with RJ capabilities allow for non-reversible, computationally efficient transdimensional exploration without explicit Metropolis–Hastings accept–reject steps, maintaining the correct invariant distribution by matching probability flow rates between models (Chevallier et al., 2020).
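The cancellation claimed for transport-based proposals under exact transport can be checked numerically in a setting where the transport is known in closed form. The sketch below is an illustration under strong assumptions, not the method of the cited papers: each model's conditional posterior is an independent Gaussian, so the exact map to a standard normal reference is affine; the constants `W`, `LOC`, `SCALE` and the function names are hypothetical, and the model-jump probabilities are taken to be symmetric so they drop out.

```python
import math
import random

W = {1: 0.3, 2: 0.7}                      # assumed posterior model probabilities
LOC = {1: [2.0], 2: [-1.0, 4.0]}          # assumed posterior means per model
SCALE = {1: [0.5], 2: [3.0, 0.2]}         # assumed posterior std devs per model


def log_phi(z):
    """Log density of the standard normal reference."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def to_reference(k, theta):
    """Exact transport for model k: native parameters -> reference space."""
    return [(t - m) / s for t, m, s in zip(theta, LOC[k], SCALE[k])]


def from_reference(k, z):
    """Inverse transport for model k: reference space -> native parameters."""
    return [m + s * zi for zi, m, s in zip(z, LOC[k], SCALE[k])]


def log_posterior(k, theta):
    """Joint log posterior of (k, theta) implied by the affine transports."""
    z = to_reference(k, theta)
    return math.log(W[k]) + sum(log_phi(zi) - math.log(s) for zi, s in zip(z, SCALE[k]))


def birth_log_ratio(theta, u):
    """Log acceptance ratio for a 1 -> 2 move proposed in reference space."""
    z = to_reference(1, theta)
    theta_new = from_reference(2, z + [u])  # concatenate, then map back
    # log |det| of (theta, u) -> theta_new, from the chain rule on the affine maps.
    log_jac = sum(math.log(s) for s in SCALE[2]) - sum(math.log(s) for s in SCALE[1])
    return log_posterior(2, theta_new) - log_posterior(1, theta) - log_phi(u) + log_jac


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        theta, u = [rng.gauss(2.0, 0.5)], rng.gauss(0.0, 1.0)
        print(birth_log_ratio(theta, u), "vs", math.log(W[2] / W[1]))
```

Whatever values of `theta` and `u` are drawn, the log ratio equals the log ratio of the model probabilities up to floating-point error, which is the model-only acceptance described in the transport-based item above. With learned, approximate flows the cancellation is only approximate.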
4. Applications: Model Selection, Nonparametric Inference, and Beyond
RJMCMC has been foundational for Bayesian model selection across a broad range of application domains. Notable examples include:
- Hierarchical Bayesian models for recommender systems: Dimensionality selection for latent-factor models of user-movie data, where RJMCMC allows for automatic inference of the number of latent features while tuning regularization via empirical Bayes (Dey et al., 2018).
- Nonparametric drift estimation: Inferring the effective number of basis functions in stochastic differential equation modeling, directly integrating over both model complexity and parameter uncertainty (Meulen et al., 2012).
- Geophysical inversion: Inferring object shapes from gravity anomaly data by letting the number of polygon vertices be random and parsimony-prior controlled (Luo, 2010).
- Change point detection with heterogeneous dynamics: Extensions of RJMCMC for time series with short-lived events (e.g., blinking states in microscopy), using compound moves to insert or remove pairs of closely spaced change points according to an exponential duration prior (Gribbin et al., 19 Feb 2026).
- Trans-space and trans-distributional learning: Joint exploration of model families not connected by dimensionality alone, e.g., selecting among heavy-tailed and Gaussian noise models via deterministically invertible mappings preserving key data features (Karakuş et al., 2017).
In each domain, careful move design, prior specification, and algorithmic augmentation tailored to the target inferential goals are critical.
5. Empirical Bayes and Adaptive Schemes
Efficient RJMCMC often requires adaptive or empirical Bayes tuning of regularization hyperparameters, particularly in models with hierarchical structure. One example is the use of stochastic approximation (e.g., Robbins–Monro or Adam-like momentum) to update regularization coefficients within the RJMCMC procedure: after each accepted move, empirical gradient information (for instance, from norms of the latent factors) moves the hyperparameters towards values that maximize marginal likelihood or predictive performance (Dey et al., 2018).
Adaptation strategies are also crucial for proposal variances and temperature scheduling (in parallel tempering or simulated annealing), and for covariance adaptation across model spaces in curve-fitting or inverse problems (Tian et al., 2024).
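As a concrete instance of proposal-variance adaptation, the sketch below applies a Robbins–Monro update to the log proposal scale of a one-dimensional random-walk sampler. It is a generic illustration, not the scheme of the cited works: the standard normal target, the step-size exponent 0.6, and the target acceptance rate 0.44 are all assumptions chosen for the example, and `adaptive_rwm` is a hypothetical name.

```python
import math
import random


def adaptive_rwm(n_iter=20000, target_rate=0.44, seed=0):
    """Random-walk Metropolis on N(0, 1) with a Robbins-Monro adapted scale.

    Returns the final proposal scale and the acceptance rate over the
    second half of the run.
    """
    rng = random.Random(seed)
    x = 0.0
    log_scale = 0.0
    accepted_late = 0
    n_late = n_iter - n_iter // 2
    for n in range(1, n_iter + 1):
        prop = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_ratio = 0.5 * x * x - 0.5 * prop * prop  # log N(prop) - log N(x)
        accepted = math.log(1.0 - rng.random()) < log_ratio
        if accepted:
            x = prop
        # Diminishing step size n^(-0.6): grow the scale after an acceptance,
        # shrink it after a rejection, balancing at the target rate.
        log_scale += n ** -0.6 * ((1.0 if accepted else 0.0) - target_rate)
        if n > n_iter // 2:
            accepted_late += accepted
    return math.exp(log_scale), accepted_late / n_late


if __name__ == "__main__":
    scale, rate = adaptive_rwm()
    print("final scale:", scale, "late acceptance rate:", rate)
```

Because the step size decays, the adaptation vanishes asymptotically; in a transdimensional sampler a separate scale would typically be kept per model or per move type.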
6. Theoretical Guarantees, Convergence, and Tuning
Formal results establish weak convergence of RJMCMC chains under scalable targets, yielding optimal scaling guidance (the $0.234$ rule for random-walk acceptance rates) and the asymptotically optimal mix of move types as a function of model space size and prior mass ratios (Gagnon et al., 2016). Establishing conditions for irreducibility, aperiodicity, and convergence to the stationary posterior law is complicated by the trans-dimensional structure, but these conditions have been analyzed in detail for both discrete-time and continuous-time variants.
In practice, empirical performance is measured by acceptance rates for between-model moves, effective sample size per second, and the accuracy of estimated posterior model probabilities, with sophisticated proposals (transport maps, multiple-try) yielding orders-of-magnitude improvements over uninformed or random-walk strategies (Davies et al., 2022, Pandolfi et al., 2010, Yin et al., 14 Dec 2025, Farr et al., 2011).
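Two of these diagnostics can be read directly from the trace of visited model indices. The helper below is a minimal sketch with a hypothetical name (`model_diagnostics`): it estimates posterior model probabilities by visit frequency and reports the fraction of consecutive iterations in which the model index changed, which is a rough proxy for between-model mixing rather than the proposal acceptance rate itself.

```python
from collections import Counter


def model_diagnostics(model_trace):
    """Visit-frequency model probabilities and the model switch rate of a trace."""
    n = len(model_trace)
    counts = Counter(model_trace)
    probs = {k: c / n for k, c in sorted(counts.items())}
    switches = sum(a != b for a, b in zip(model_trace, model_trace[1:]))
    switch_rate = switches / (n - 1) if n > 1 else 0.0
    return probs, switch_rate


if __name__ == "__main__":
    trace = [1, 1, 2, 2, 2, 1, 2, 2, 3, 3]  # made-up trace for illustration
    print(model_diagnostics(trace))
```

Visit frequencies are only reliable when the chain switches models often enough; a very low switch rate is a sign that the frequency estimates may be dominated by a few long sojourns.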
7. Generalizations and Future Directions
Recent innovations extend RJMCMC beyond its original scope in several directions:
- Noisy RJMCMC: For doubly-intractable posteriors (i.e., those whose likelihood normalizing constants are unavailable), the acceptance probability is computed using unbiased Monte Carlo estimators, with convergence guarantees provided the estimator variance is controlled (Bouranis et al., 2017).
- Fixed-dimensional alternatives: Mixtures of mutually singular distributions (MoMS) reparametrize variable selection within a fixed-dimensional space, recovering the same acceptance probabilities as RJMCMC under suitable proposals, and offering practical implementation advantages (Bergh et al., 30 Apr 2026).
- Integration with deep learning and normalizing flows: VI-NF-based proposal construction for both within- and between-model moves enables highly expressive, adaptive, and scalable RJMCMC sampling for modern inference problems (Yin et al., 14 Dec 2025).
Open problems concern algorithmic automation for arbitrary model spaces, error-robustness in approximate transports, and further scalability to very large state spaces as encountered in modern statistical and machine learning applications.
References
- (Dey et al., 2018): A novel Empirical Bayes with Reversible Jump Markov Chain in User-Movie Recommendation system
- (Karakuş et al., 2017): Beyond trans-dimensional RJMCMC with a case study in impulsive data modeling
- (Luo, 2010): Constraining the shape of a gravity anomalous body using reversible jump Markov chain Monte Carlo
- (Yin et al., 14 Dec 2025): Transport Reversible Jump Markov Chain Monte Carlo with proposals generated by Variational Inference with Normalizing Flows
- (Farr et al., 2011): An Efficient Interpolation Technique for Jump Proposals in Reversible-Jump Markov Chain Monte Carlo Calculations
- (Barker et al., 2010): Posterior model probabilities computed from model-specific Gibbs output
- (Tian et al., 2024): Adaptive tempered reversible jump algorithm for Bayesian curve fitting
- (Meulen et al., 2012): Reversible jump MCMC for nonparametric drift estimation for diffusion processes
- (Davies et al., 2022): Transport Reversible Jump Proposals
- (Chevallier et al., 2020): Reversible Jump PDMP Samplers for Variable Selection
- (Bouranis et al., 2017): Model comparison for Gibbs random fields using noisy reversible jump Markov chain Monte Carlo
- (Gagnon et al., 2016): Weak Convergence and Optimal Tuning of the Reversible Jump Algorithm
- (Andrieu et al., 2013): Reversible Jump MCMC Simulated Annealing for Neural Networks
- (Pandolfi et al., 2010): A generalized Multiple-try Metropolis version of the Reversible Jump algorithm
- (Bergh et al., 30 Apr 2026): Reversible Jump MCMC With No Regrets: Bayesian Variable Selection Using Mixtures of Mutually Singular Distributions
- (Gribbin et al., 19 Feb 2026): An extension to reversible jump Markov chain Monte Carlo for change point problems with heterogeneous temporal dynamics