
NETI: Non-Equilibrium Thermodynamic Integration

Updated 5 February 2026
  • NETI is a computational technique that estimates Bayes factors by variationally annealing between Bayesian posteriors to reduce estimator variance.
  • It applies non-equilibrium statistical mechanics principles to bypass the high-variance prior regimes inherent in standard thermodynamic integration.
  • NETI achieves significant variance reduction in nested model comparisons through fine-grained discretization and minimized relaxation errors.

Non-Equilibrium Thermodynamic Integration (NETI) is a computational methodology for estimating Bayes factors via marginal likelihood ratios between Bayesian models, with a focus on minimizing estimator variance and discretization error. NETI variationally anneals between posterior distributions by leveraging non-equilibrium statistical mechanics principles, systematically circumventing the high-variance regimes associated with conventional prior-to-posterior thermodynamic integration (TI). This approach yields significant variance reduction when models share parameters, particularly in nested model comparison scenarios (Grzegorczyk et al., 2017).

1. Background: Thermodynamic Integration for Marginal Likelihoods

Thermodynamic integration (TI) is a standard approach to estimate the marginal likelihood $p(D|M)$ for a model $M$ given data $D$, where

$$p(D|M) = \int p(D|\theta, M)\, p(\theta|M)\, d\theta,$$

with $\theta$ as the parameter vector. TI constructs a family of tempered "power posteriors":

$$p_t(\theta|D,M) \propto p(D|\theta,M)^t\, p(\theta|M), \quad t \in [0,1],$$

with normalization $Z(t)$ such that $Z(0)=1$ and $Z(1)=p(D|M)$. The log marginal likelihood decomposes as

$$\log p(D|M) = \int_0^1 \mathbb{E}_{p_t}[\log p(D|\theta,M)]\, dt,$$

which is integrated numerically by discretizing $t$ and estimating the expectations with MCMC.
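As an illustration of the TI identity, the following minimal sketch applies it to a toy conjugate model that is not taken from the source: $y_i \sim N(\theta, 1)$ with prior $\theta \sim N(0, \tau^2)$. The model, the function names, the grid, and the sample sizes are all assumptions chosen so that the power posterior is Gaussian and can be sampled exactly (standing in for MCMC) and so that the exact log marginal likelihood is available for comparison.

```python
import math
import random

def log_lik(theta, n, s1, s2):
    """log p(D | theta) for y_i ~ N(theta, 1), using sufficient statistics."""
    sq_err = s2 - 2.0 * theta * s1 + n * theta * theta  # sum_i (y_i - theta)^2
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sq_err

def exact_log_evidence(n, s1, s2, tau2):
    """Closed-form log p(D) for the toy model with prior theta ~ N(0, tau2)."""
    quad = s2 - tau2 * s1 * s1 / (1.0 + n * tau2)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log1p(n * tau2) - 0.5 * quad)

def ti_log_evidence(n, s1, s2, tau2, temps, n_samples, rng):
    """Trapezoidal TI estimate of log p(D) over the temperature grid `temps`."""
    means = []
    for t in temps:
        # The power posterior is Gaussian in this toy model, so it is sampled
        # exactly here (an assumption replacing the MCMC step of real TI).
        precision = t * n + 1.0 / tau2
        mean, sd = t * s1 / precision, math.sqrt(1.0 / precision)
        total = sum(log_lik(rng.gauss(mean, sd), n, s1, s2)
                    for _ in range(n_samples))
        means.append(total / n_samples)
    return sum(0.5 * (means[k] + means[k + 1]) * (temps[k + 1] - temps[k])
               for k in range(len(temps) - 1))

rng = random.Random(0)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]      # synthetic data
n, s1, s2 = len(y), sum(y), sum(v * v for v in y)
tau2 = 4.0                                        # assumed prior variance
temps = [(k / 100) ** 5 for k in range(101)]      # grid concentrated near t = 0
estimate = ti_log_evidence(n, s1, s2, tau2, temps, 2000, rng)
exact = exact_log_evidence(n, s1, s2, tau2)
print(f"TI estimate: {estimate:.3f}   exact: {exact:.3f}")
```

In a realistic model the exact sampler would be replaced by an MCMC kernel targeting each power posterior, and no closed-form reference value would be available.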

A major limitation of TI is the "prior regime" for $t$ near zero, where $p_t(\theta|D,M)$ is dominated by the prior. The Monte Carlo approximation of $\mathbb{E}_{p_t}[\log p(D|\theta,M)]$ in this regime is highly variable, especially when the likelihood is diffuse under the prior or $M$ has a high-dimensional parameter space. This can dominate the total estimator variance, requiring impractically fine temperature grids or large MCMC samples for reliable estimation (Grzegorczyk et al., 2017).
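The prior-regime effect can be seen in a small numerical experiment. The sketch below assumes a toy Gaussian model ($y_i \sim N(\theta, 1)$, prior $\theta \sim N(0, \tau^2)$) that is not from the source, and compares the per-sample variance of $\log p(D|\theta, M)$ under the power posterior at several temperatures; all names and settings are illustrative.

```python
import math
import random
import statistics

def log_lik(theta, n, s1, s2):
    """log p(D | theta) for y_i ~ N(theta, 1), using sufficient statistics."""
    sq_err = s2 - 2.0 * theta * s1 + n * theta * theta
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sq_err

def loglik_variance(t, n, s1, s2, tau2, n_samples, rng):
    """Sample variance of log p(D | theta) when theta ~ p_t (exact draws)."""
    precision = t * n + 1.0 / tau2
    mean, sd = t * s1 / precision, math.sqrt(1.0 / precision)
    draws = [log_lik(rng.gauss(mean, sd), n, s1, s2) for _ in range(n_samples)]
    return statistics.variance(draws)

rng = random.Random(1)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]   # synthetic data
n, s1, s2 = len(y), sum(y), sum(v * v for v in y)
tau2 = 4.0                                     # assumed prior variance

variances = {t: loglik_variance(t, n, s1, s2, tau2, 5000, rng)
             for t in (0.0, 0.01, 0.1, 1.0)}
for t, v in variances.items():
    print(f"t = {t:<5} Var[log-likelihood] = {v:.3f}")
```

In this toy setting the variance at $t = 0$ exceeds the variance at $t = 1$ by several orders of magnitude, which is why the grid points near zero dominate the Monte Carlo error.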

2. Direct-Path TI for Model Comparison

For hypothesis testing or model comparison, the target is the Bayes factor $\textrm{BF} = p(D|M_2)/p(D|M_1)$. Rather than estimating each marginal likelihood with a separate prior-to-posterior TI run, the direct-path TI method defines an annealing path between the posteriors of $M_1$ and $M_2$:

$$p_t(\theta|D; M_1, M_2) \propto p(D|\theta, M_2)^t\, p(D|\theta, M_1)^{1-t}\, p(\theta | M_1, M_2),$$

with a joint prior $p(\theta|M_1, M_2)$ marginalizing to the individual model priors. The path's partition function is

$$Z(t) = \int \left[\frac{p(D|\theta, M_2)}{p(D|\theta, M_1)}\right]^t p(D|\theta, M_1)\, p(\theta | M_1, M_2)\, d\theta.$$

One obtains

$$\frac{d}{dt} \log Z(t) = \mathbb{E}_{p_t}\left[\log \frac{p(D|\theta, M_2)}{p(D|\theta, M_1)}\right].$$

Since $Z(0) = p(D|M_1)$ and $Z(1) = p(D|M_2)$, integrating over $t \in [0,1]$ yields

$$\log \textrm{BF} = \int_0^1 \mathbb{E}_{p_t}[\Delta \ell(\theta)]\, dt,$$

where $\Delta \ell(\theta) = \log p(D|\theta, M_2) - \log p(D|\theta, M_1)$. This path systematically avoids the problematic prior regime inherent in standard TI (Grzegorczyk et al., 2017).
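For readers who want the intermediate steps, the identity above can be derived as follows. This is a sketch that assumes differentiation under the integral sign is permitted and that the joint prior marginalizes to each model's prior, so that $Z(0) = p(D|M_1)$ and $Z(1) = p(D|M_2)$.

```latex
\begin{align*}
Z(t) &= \int e^{t\,\Delta\ell(\theta)}\, p(D|\theta, M_1)\, p(\theta|M_1, M_2)\, d\theta, \\
\frac{d}{dt} Z(t) &= \int \Delta\ell(\theta)\, e^{t\,\Delta\ell(\theta)}\, p(D|\theta, M_1)\, p(\theta|M_1, M_2)\, d\theta, \\
\frac{d}{dt} \log Z(t) &= \frac{1}{Z(t)} \frac{d}{dt} Z(t)
  = \int \Delta\ell(\theta)\, p_t(\theta|D; M_1, M_2)\, d\theta
  = \mathbb{E}_{p_t}[\Delta\ell(\theta)], \\
\int_0^1 \frac{d}{dt} \log Z(t)\, dt &= \log Z(1) - \log Z(0)
  = \log p(D|M_2) - \log p(D|M_1) = \log \textrm{BF}.
\end{align*}
```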

3. Non-Equilibrium Thermodynamic Integration Framework

The non-equilibrium TI (NETI) framework adapts statistical mechanical concepts, in particular Jarzynski's equality, to estimate normalizing constant ratios:

$$\textrm{BF} = \langle \exp(-W) \rangle,$$

where $W$ is the accumulated "work" along a non-equilibrium protocol as $t$ evolves from $0$ to $1$. In practice, a single long MCMC trajectory is run, with $t$ updated adiabatically in fine-grained steps $\Delta t$ and $\Delta \ell(\theta(t))$ recorded at each step. The continuous path integral

$$\log \textrm{BF} \approx \int_0^1 \Delta \ell(\theta(t))\, dt$$

is discretized as

$$\log \textrm{BF} \approx \sum_{k=1}^K \Delta \ell(\theta(t_k))\, \Delta t_k,$$

with $K$ large. The discretization error scales as $\mathcal{O}(\max_k \Delta t_k^2)$ and becomes negligible as $K$ increases (typically there are as many temperature steps as total MCMC iterations). The dominant remaining error component, the "relaxation error," diminishes as $\mathcal{O}(1/N_{\textrm{iter}})$, versus the $\mathcal{O}(1/\sqrt{N_{\textrm{iter}}})$ scaling of standard TI (Grzegorczyk et al., 2017).
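The Jarzynski-type identity can be checked numerically on a toy problem. The sketch below is an assumption-laden illustration, not the source's implementation: both models use $y_i \sim N(\theta, \sigma_m^2)$ with a shared prior $\theta \sim N(0, \tau^2)$ and differ only in the known variance $\sigma_m^2$, the discrete work is taken as $W = -\sum_k (t_{k+1} - t_k)\, \Delta \ell(\theta_k)$ (the annealed-importance-sampling form), and exact draws from each path distribution stand in for MCMC updates.

```python
import math
import random

SIGMA2_1, SIGMA2_2, TAU2 = 1.0, 4.0, 4.0   # hypothetical toy settings

def log_lik(theta, sigma2, n, s1, s2):
    """log p(D | theta, M) for y_i ~ N(theta, sigma2)."""
    sq_err = s2 - 2.0 * theta * s1 + n * theta * theta
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - 0.5 * sq_err / sigma2

def delta_ell(theta, n, s1, s2):
    return log_lik(theta, SIGMA2_2, n, s1, s2) - log_lik(theta, SIGMA2_1, n, s1, s2)

def exact_log_evidence(sigma2, n, s1, s2):
    """Closed-form log p(D | M) under the shared prior theta ~ N(0, TAU2)."""
    quad = (s2 - TAU2 * s1 * s1 / (sigma2 + n * TAU2)) / sigma2
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * (n - 1) * math.log(sigma2)
            - 0.5 * math.log(sigma2 + n * TAU2) - 0.5 * quad)

def sample_path(t, n, s1, rng):
    """Exact draw from the (Gaussian) path distribution p_t."""
    w = t / SIGMA2_2 + (1.0 - t) / SIGMA2_1
    precision = n * w + 1.0 / TAU2
    return rng.gauss(w * s1 / precision, math.sqrt(1.0 / precision))

def work(temps, n, s1, s2, rng):
    """Accumulated work W = -sum_k (t_{k+1} - t_k) * delta_ell(theta_k)."""
    return -sum((temps[k + 1] - temps[k])
                * delta_ell(sample_path(temps[k], n, s1, rng), n, s1, s2)
                for k in range(len(temps) - 1))

rng = random.Random(2)
y = [rng.gauss(1.0, 1.5) for _ in range(20)]   # synthetic data
n, s1, s2 = len(y), sum(y), sum(v * v for v in y)
temps = [k / 49 for k in range(50)]            # t_1 = 0, ..., t_50 = 1

neg_work = [-work(temps, n, s1, s2, rng) for _ in range(500)]
peak = max(neg_work)                           # log-mean-exp for stability
log_bf_jarzynski = peak + math.log(
    sum(math.exp(v - peak) for v in neg_work) / len(neg_work))
mean_neg_work = sum(neg_work) / len(neg_work)
log_bf_exact = (exact_log_evidence(SIGMA2_2, n, s1, s2)
                - exact_log_evidence(SIGMA2_1, n, s1, s2))
print(f"log<exp(-W)> = {log_bf_jarzynski:.3f}   <-W> = {mean_neg_work:.3f}   "
      f"exact = {log_bf_exact:.3f}")
```

By Jensen's inequality the plain average of $-W$ never exceeds $\log \langle \exp(-W) \rangle$; the two approach each other as the protocol becomes slower, which is the regime the single-trajectory approximation relies on.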

4. NETI-DIFF Algorithmic Implementation

The NETI-DIFF algorithm proceeds as follows. For a schedule $t_1 = 0 < t_2 < \dots < t_K = 1$ (power-law or sigmoid, as appropriate):

  1. Initialize $\theta(0)$ by sampling from $p(\theta|D, M_1)$.
  2. For $k=1$ to $K$:
    • Set $t=t_k$.
    • Perform MCMC update(s) targeting $p_t(\theta)$.
    • Record $\Delta \ell_k = \log p(D|\theta, M_2) - \log p(D|\theta, M_1)$.
    • If $k < K$, advance $t \leftarrow t_{k+1}$.
  3. Compute the estimator:

$$\widehat{\log \textrm{BF}} = \sum_{k=1}^{K-1} \frac{\Delta \ell_k + \Delta \ell_{k+1}}{2} (t_{k+1} - t_k).$$

This approach leverages non-equilibrium integration, drastically increasing temperature resolution without significant computational overhead, since $\theta$ is updated only briefly at each $t$ (Grzegorczyk et al., 2017).
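The steps above can be sketched in a few lines. The following is a minimal illustration under stated assumptions rather than the authors' implementation: the two toy models ($y_i \sim N(\theta, \sigma_m^2)$ with known, differing variances and a shared prior $\theta \sim N(0, \tau^2)$), the linear schedule, the random-walk Metropolis kernel, and the step size are all hypothetical choices made so that the result can be compared with a closed-form log Bayes factor.

```python
import math
import random

SIGMA2_1, SIGMA2_2, TAU2 = 1.0, 4.0, 4.0   # hypothetical toy settings

def log_lik(theta, sigma2, n, s1, s2):
    """log p(D | theta, M) for y_i ~ N(theta, sigma2)."""
    sq_err = s2 - 2.0 * theta * s1 + n * theta * theta
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - 0.5 * sq_err / sigma2

def exact_log_evidence(sigma2, n, s1, s2):
    """Closed-form log p(D | M) under the shared prior theta ~ N(0, TAU2)."""
    quad = (s2 - TAU2 * s1 * s1 / (sigma2 + n * TAU2)) / sigma2
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * (n - 1) * math.log(sigma2)
            - 0.5 * math.log(sigma2 + n * TAU2) - 0.5 * quad)

def log_target(theta, t, n, s1, s2):
    """Unnormalized log density of the path distribution p_t."""
    return (t * log_lik(theta, SIGMA2_2, n, s1, s2)
            + (1.0 - t) * log_lik(theta, SIGMA2_1, n, s1, s2)
            - 0.5 * theta * theta / TAU2)

def neti_diff(temps, n, s1, s2, rng, step_sd=0.5):
    # Step 1: draw theta from p(theta | D, M_1), which is Gaussian here.
    precision = n / SIGMA2_1 + 1.0 / TAU2
    theta = rng.gauss((s1 / SIGMA2_1) / precision, math.sqrt(1.0 / precision))
    deltas = []
    for t in temps:
        # Step 2: a single random-walk Metropolis update targeting p_t,
        # then record the log-likelihood difference at the current state.
        proposal = theta + rng.gauss(0.0, step_sd)
        log_ratio = (log_target(proposal, t, n, s1, s2)
                     - log_target(theta, t, n, s1, s2))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        deltas.append(log_lik(theta, SIGMA2_2, n, s1, s2)
                      - log_lik(theta, SIGMA2_1, n, s1, s2))
    # Step 3: trapezoidal estimator of log BF.
    return sum(0.5 * (deltas[k] + deltas[k + 1]) * (temps[k + 1] - temps[k])
               for k in range(len(temps) - 1))

rng = random.Random(3)
y = [rng.gauss(1.0, 1.5) for _ in range(20)]   # synthetic data
n, s1, s2 = len(y), sum(y), sum(v * v for v in y)
num_temps = 20000                              # one MCMC update per temperature
temps = [k / (num_temps - 1) for k in range(num_temps)]
log_bf_hat = neti_diff(temps, n, s1, s2, rng)
log_bf_exact = (exact_log_evidence(SIGMA2_2, n, s1, s2)
                - exact_log_evidence(SIGMA2_1, n, s1, s2))
print(f"NETI-DIFF sketch: {log_bf_hat:.3f}   exact: {log_bf_exact:.3f}")
```

Note that the chain is never equilibrated at any single temperature; accuracy comes from the temperature changing slowly relative to the mixing time of the sampler.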

5. Variance Reduction Theoretical Results

Let $V_{\textrm{TI}}$ and $V_{\textrm{NETI}}$ denote the variances of the standard TI and NETI-DIFF estimators, respectively. Under mild regularity conditions, when models share $p_{\textrm{shared}} \ll \dim(\theta)$ parameters,

$$V_{\textrm{NETI}} = \mathcal{O}(1/N)$$

with a prefactor approximately reduced by $p_{\textrm{shared}}/\dim(\theta)$, while

$$V_{\textrm{TI}} = \mathcal{O}(1/N)$$

with no reduction. This indicates orders-of-magnitude variance reduction for NETI-DIFF in high-overlap (e.g., nested) model scenarios (Grzegorczyk et al., 2017). A plausible implication is that NETI-DIFF particularly excels in Bayesian model selection tasks featuring nested or similar model parametrizations.

6. Empirical Evaluations and Benchmarks

Empirical assessment compared standard TI (trapezoidal), TI with Friel & Pettitt corrections, and NETI-DIFF on the following benchmarks:

  • Radiata pine: Linear, $n=42$, non-nested regressions, closed-form $\textrm{BF}=8.8571$.
  • Pima Indians: Logistic, $n=532$, nested regressions, $\textrm{BF} \approx -2.6177$ gold standard.
  • Radiocarbon: Polynomial Bayesian linear regression (orders up to 10), closed-form BF.
  • Biopathway: Nonlinear hierarchical ODE model, network inference, surrogate gold standard.

Performance metrics include average absolute error $A = \textrm{mean}|\textrm{BF}_{\textrm{est}} - \textrm{BF}_{\textrm{true}}|$ and variance $V = \textrm{Var}(\textrm{BF}_{\textrm{est}})$, with $N_{\textrm{iter}}$ ranging from $10^4$ to $10^7$ and $K$ between 10 and 200. Principal findings:

  • Radiata pine (non-nested): No significant difference, NETI $\approx$ TI.
  • Pima Indians (nested): NETI reduced $V$ and $A$ by factors of 5–50.
  • Radiocarbon: NETI reduced $V$ by factors of up to $10^3$ for large model differences.
  • Biopathway: NETI reduced $V$ by one to two orders of magnitude and improved network-selection accuracy (Grzegorczyk et al., 2017).
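The two metrics $A$ and $V$ defined above are computed over repeated independent runs of an estimator. A minimal sketch follows; the function names are illustrative, the use of the sample (rather than population) variance is an assumption, and the numbers are arbitrary placeholders, not results from the benchmarks.

```python
import statistics

def average_absolute_error(estimates, true_value):
    """A = mean |BF_est - BF_true| over repeated runs."""
    return statistics.fmean([abs(e - true_value) for e in estimates])

def estimator_variance(estimates):
    """V = Var(BF_est); sample variance is assumed here."""
    return statistics.variance(estimates)

# Placeholder values for illustration only (not benchmark results).
estimates = [1.0, 2.0, 3.0]
true_value = 2.0
A = average_absolute_error(estimates, true_value)
V = estimator_variance(estimates)
print(f"A = {A:.4f}, V = {V:.4f}")
```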

7. Practical Guidelines for NETI-DIFF

  • Path design: Use a power-law schedule $t_k = (k/K)^\alpha$ ($\alpha \approx 5$), which is denser near $t=0$, for nested models; apply a symmetric sigmoid schedule for non-nested comparisons to mitigate end-bias.
  • Number of steps: Set the number of temperature steps equal to the number of MCMC iterations; NETI-DIFF obviates the need for full equilibration at each $t$.
  • Computation: The per-iteration cost is similar to TI, and in some cases slightly lower for NETI-DIFF due to holding shared parameters constant at $t=1$.
  • Variance reduction strategies: Combine with control variate techniques (CTI) by applying variance-reduction corrections to $\Delta \ell(\theta)$ after path construction.
  • Overall effect: NETI-DIFF replaces the two prior-to-posterior integrations with a single posterior$_1$-to-posterior$_2$ integration, bypassing the high-variance prior regimes and exploiting fine-grained non-equilibrium annealing schedules for substantial variance reductions in appropriate model-comparison settings (Grzegorczyk et al., 2017).
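The two schedule families mentioned in the path-design guideline can be generated as follows. This is a sketch under assumptions: the index runs over $k = 0, \dots, K$ so that the grid starts exactly at $t = 0$, and the sigmoid is taken to be a rescaled logistic curve with a hypothetical `steepness` parameter, since the source does not give its functional form.

```python
import math

def power_law_schedule(num_steps, alpha=5.0):
    """t_k = (k / K)^alpha for k = 0..K; grid points cluster near t = 0."""
    return [(k / num_steps) ** alpha for k in range(num_steps + 1)]

def sigmoid_schedule(num_steps, steepness=6.0):
    """Symmetric schedule from a logistic curve rescaled to hit 0 and 1.

    Grid points cluster near both ends; `steepness` is a hypothetical
    tuning parameter, not a value taken from the source.
    """
    def logistic(u):
        return 1.0 / (1.0 + math.exp(-steepness * (2.0 * u - 1.0)))
    lo, hi = logistic(0.0), logistic(1.0)
    return [(logistic(k / num_steps) - lo) / (hi - lo)
            for k in range(num_steps + 1)]

power = power_law_schedule(100)
sigmoid = sigmoid_schedule(100)
print("first/last power-law gaps:", power[1] - power[0], power[-1] - power[-2])
print("first/middle sigmoid gaps:", sigmoid[1] - sigmoid[0], sigmoid[51] - sigmoid[50])
```

Either list can be passed directly as the temperature grid of a trapezoidal estimator; the power-law grid spends most of its points near the $M_1$ end, while the sigmoid grid treats both ends alike.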
