Feedback mechanism for swap dynamics that captures mixing objectives

Develop a swap-dependent feedback mechanism for parallel tempering Markov Chain Monte Carlo that effectively captures the objective of reducing autocorrelation and improving mixing, suitable for use as a reward or control signal when adaptively selecting temperatures.

Background

Sampler efficiency in parallel tempering is typically assessed via the integrated autocorrelation time (ACT), but using ACT directly as a feedback signal is impractical during sampling. Common alternatives, such as uniformity of swap acceptance rates or the expected squared jumping distance (ESJD), either focus on swap efficiency without fully reflecting the effect on mixing or are computationally burdensome.
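To make the ACT criterion concrete, the following is a minimal sketch of one common estimator, tau = 1 + 2 * sum of lag-k autocorrelations. It is not taken from the paper. The function names `integrated_act` and `ar1_series`, and the rule of truncating the sum at the first non-positive autocorrelation, are assumptions chosen for illustration. The estimator needs a full trace of past samples and costs on the order of (trace length) x (number of lags), which hints at why ACT is awkward to use as an online signal.

```python
import random


def integrated_act(series, max_lag=None):
    """Estimate the integrated autocorrelation time, tau = 1 + 2 * sum_k rho_k.

    Assumption (illustrative heuristic): the sum is truncated at the first
    lag whose estimated autocorrelation is non-positive.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        raise ValueError("series is constant; autocorrelation is undefined")
    if max_lag is None:
        max_lag = n // 2

    tau = 1.0
    for lag in range(1, max_lag + 1):
        # Biased (divide-by-n) autocovariance estimate at this lag.
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def ar1_series(phi, n, seed=0):
    """Hypothetical test input: AR(1) trace x_t = phi * x_{t-1} + noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    # A strongly correlated trace should yield a much larger tau than white noise.
    print("white noise tau:", integrated_act(ar1_series(0.0, 2000)))
    print("AR(1), phi=0.9 tau:", integrated_act(ar1_series(0.9, 2000)))
```

For an AR(1) process the theoretical value is (1 + phi) / (1 - phi), so the sketch can be sanity-checked against a known answer before being applied to sampler output.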

Zhao et al. propose the swap mean-distance metric as a proxy and report strong empirical correlations with ACT. Nonetheless, the authors explicitly note that constructing a feedback mechanism that reliably captures the mixing objective during swapping remains an open challenge, which motivates continued investigation into more effective reward designs.
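As a rough illustration of what a swap-distance style reward could look like, the sketch below averages, over proposed swaps between two adjacent chains, the swap acceptance probability multiplied by the Euclidean distance between the two states. This is an assumed reading of "swap mean distance" and may differ from the paper's exact definition. The names `swap_acceptance`, `swap_mean_distance`, and `energy` are hypothetical, and the chains are assumed to target pi(x)^beta with energy(x) = -log pi(x).

```python
import math


def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Standard Metropolis acceptance probability for swapping two replicas.

    Chain i targets exp(-beta_i * energy(x)); the swap ratio is
    exp((beta_i - beta_j) * (energy_i - energy_j)).
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def swap_mean_distance(pairs, beta_i, beta_j, energy):
    """Average of (acceptance probability * distance) over proposed swaps.

    Assumption: this is an illustrative proxy, not the paper's formula.
    `pairs` holds (x_i, x_j) state tuples, one per proposed swap.
    """
    if not pairs:
        raise ValueError("need at least one proposed swap")
    total = 0.0
    for x_i, x_j in pairs:
        accept = swap_acceptance(beta_i, beta_j, energy(x_i), energy(x_j))
        total += accept * math.dist(x_i, x_j)
    return total / len(pairs)


def quadratic_energy(x):
    """Hypothetical target: standard Gaussian, energy = ||x||^2 / 2."""
    return 0.5 * sum(v * v for v in x)


if __name__ == "__main__":
    proposals = [((0.1, -0.2), (1.5, 0.7)), ((2.0, 1.0), (0.3, 0.1))]
    print(swap_mean_distance(proposals, 1.0, 0.5, quadratic_energy))
```

Weighting the distance by the acceptance probability, rather than by the realized accept/reject outcome, is a variance-reduction choice made here for simplicity. Either way, a quantity of this kind rewards temperature spacings under which swaps are both likely and move the colder chain far.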

References

Frequently swapping states with hotter chains in parallel tempering helps reduce autocorrelation. However, developing a feedback mechanism for the swap mechanism that effectively captures this goal remains an open challenge.

Policy Gradients for Optimal Parallel Tempering MCMC (2409.01574 - Zhao et al., 3 Sep 2024) in Section "Swap Mean Distance"