Limiting worst-case behavior of empirical effective sample size under weight harmonization

Determine whether the empirical effective sample size, defined as ess(W) = 1 / (sum of squared harmonized weights) and produced by the weight-harmonization algorithm for coupled Markov chain Monte Carlo chains, converges to a non-degenerate worst-case limiting regime that still provides useful convergence bounds. If it does, characterize this limit precisely in terms of the number of chains and the mixing behavior.
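As a minimal sketch of the quantity in question, the snippet below evaluates ess(W) exactly as stated above: one divided by the sum of squared weights. The function name `ess` and the example weight vectors are illustrative assumptions, not taken from the paper, and the examples further assume that the harmonized weights are nonnegative and sum to one. The harmonization algorithm itself is not reproduced here.

```python
import numpy as np


def ess(weights):
    """Empirical effective sample size: 1 / sum of squared weights.

    Assumption (not from the source): the weights passed in are already
    harmonized, nonnegative, and sum to one.
    """
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w ** 2)


# Hypothetical weight vectors for N = 8 chains.
uniform = np.full(8, 1.0 / 8.0)             # all chains weighted equally
degenerate = np.array([1.0] + [0.0] * 7)    # all mass on a single chain

ess_uniform = ess(uniform)        # equals N when the weights are uniform
ess_degenerate = ess(degenerate)  # equals 1 when one weight carries all mass
```

Under the sum-to-one assumption, ess(W) ranges between 1 (fully degenerate weights) and the number of chains (uniform weights), which is the scale on which the conjectured worst-case limit would be read.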

Background

In the Ornstein–Uhlenbeck experiment, the authors observe that the empirical effective sample size derived from the harmonized weights is systematically conservative compared to the theoretical chi-squared-based benchmark, that the underestimation does not improve with mixing speed after rescaling, and that the degradation worsens with the number of chains. These observations suggest the existence of a limiting regime governing the diagnostic’s conservativeness.

Establishing and characterizing a non-degenerate worst-case limit for the empirical effective sample size would calibrate the diagnostic’s behavior, clarify its conservativeness, and guide practical interpretation and methodological improvements.

References

In practice, we conjecture that the empirical effective sample size degradation converges to a non-degenerate "worst case" scenario which still provides useful bounds on the convergence of the system.

— A coupling-based approach to f-divergences diagnostics for Markov chain Monte Carlo (2510.07559 - Corenflos et al., 8 Oct 2025) in Section 5.1 (A fully tractable system)
