
Symmetric Splitting Method for mCCAdL

Updated 1 January 2026
  • The method delivers second-order accuracy by decomposing dynamics into five operator flows for enhanced stability.
  • It leverages Strang splitting to enable larger integration steps, reducing gradient evaluations while maintaining efficiency.
  • Empirical results confirm improved performance in Bayesian sampling tasks, outperforming first-order integrators in various benchmarks.

The symmetric splitting method for the modified covariance-controlled adaptive Langevin (mCCAdL) thermostat is a second-order accurate integrator for large-scale Bayesian sampling. It uses Strang splitting to organize the numerical propagation of the mCCAdL system into sub-steps corresponding to distinct physical and stochastic processes. This approach supersedes the original first-order Euler discretization in CCAdL, providing enhanced stability and allowing substantially larger integration steps while maintaining accuracy and efficiency. The implementation decomposes the system's dynamics into five operator flows (Hamiltonian drift, stochastic gradient, covariance-controlled momentum scaling, Ornstein–Uhlenbeck thermostat, and Nosé–Hoover thermostat), executed in a pre-designed reversible sequence that yields superior numerical properties (Wei et al., 30 Dec 2025).

1. Formulation of mCCAdL Dynamics and Operator Splitting

The continuous-time mCCAdL system consists of coupled stochastic differential equations governing the evolution of the position $q \in \mathbb{R}^n$ (parameters), momentum $p \in \mathbb{R}^n$, and thermostat variable $\xi \in \mathbb{R}$, formulated as:

$$
\begin{aligned}
dq &= M^{-1}p\,dt \\
dp &= -\nabla U(q)\,dt + \sqrt{h\,\Sigma(q)}\,dW - \frac{h}{2}\beta\,\Sigma(q)\,p\,dt - \xi\,p\,dt + \sqrt{2A\beta^{-1}}\,M^{1/2}\,dW_A \\
d\xi &= \mu^{-1}\left(p^\top M^{-1} p - n k_B T\right)dt
\end{aligned}
$$

with $U(q)$ the potential, $\Sigma(q)$ the covariance of the stochastic-gradient noise, $M$ the mass matrix, $\beta$ the inverse temperature, $A$ the artificial friction, and $\mu$ the thermal mass (Wei et al., 30 Dec 2025).
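For orientation, the coupled SDEs can be discretized naively with first-order Euler–Maruyama, the baseline that the splitting method replaces. The sketch below assumes, purely for illustration, $M = I$ and a constant isotropic noise covariance $\Sigma(q) = \sigma^2 I$; all function and parameter names are hypothetical.

```python
import numpy as np

def euler_mccadl_step(q, p, xi, h, beta, A, mu, kBT, sigma2, grad_U, rng):
    """One first-order Euler-Maruyama step of the mCCAdL SDEs.

    Sketch assumptions: M = I and constant Sigma(q) = sigma2 * I.
    """
    n = q.size
    dW = np.sqrt(h) * rng.standard_normal(n)      # Wiener increments over dt = h
    dW_A = np.sqrt(h) * rng.standard_normal(n)
    q_new = q + h * p                             # dq = M^{-1} p dt
    p_new = (p
             - h * grad_U(q)                      # -grad U(q) dt
             + np.sqrt(h * sigma2) * dW           # sqrt(h Sigma) dW
             - h * (h / 2) * beta * sigma2 * p    # -(h/2) beta Sigma p dt
             - h * xi * p                         # -xi p dt
             + np.sqrt(2 * A / beta) * dW_A)      # sqrt(2 A beta^{-1}) M^{1/2} dW_A
    xi_new = xi + (h / mu) * (p @ p - n * kBT)    # Nose-Hoover variable
    return q_new, p_new, xi_new
```

The stability restriction of this scheme (small admissible $h$) is what motivates the symmetric splitting described below.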

The generator is split into five sub-operators ("A", "B", "C", "O", "D"):

| Operator | Physical Role | Update |
|---|---|---|
| A | Hamiltonian drift in $q$ | $q \leftarrow q + t\,M^{-1}p$ |
| B | Stochastic gradient force | $p \leftarrow p + t\,\widetilde{F}(q)$ |
| C | Covariance-controlled momentum scaling | $p \leftarrow \exp[-t(h/2)\beta\,\Sigma(q)]\,p$ |
| O | Ornstein–Uhlenbeck friction/noise on $p$ | $p$ updated via the explicit OU solution |
| D | Nosé–Hoover thermostat for $\xi$ | $\xi \leftarrow \xi + t\,\mu^{-1}(p^\top M^{-1}p - n k_B T)$ |
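The five elementary flows can be sketched as small Python updates. This assumes the diagonal case $M = I$ and $\Sigma(q) = \mathrm{diag}(\sigma^2)$; the closed form used for the O flow (an OU process with $\xi$ held frozen during the sub-step) is a standard choice and may differ in detail from the paper's formulas.

```python
import numpy as np

def flow_A(q, p, t):
    """Hamiltonian drift: q <- q + t M^{-1} p (M = I assumed)."""
    return q + t * p

def flow_B(p, q, t, F_tilde):
    """Stochastic-gradient kick: p <- p + t * F_tilde(q)."""
    return p + t * F_tilde(q)

def flow_C(p, t, h, beta, sigma2_diag):
    """Covariance-controlled scaling for diagonal Sigma(q) = diag(sigma2_diag)."""
    return np.exp(-t * (h / 2) * beta * sigma2_diag) * p

def flow_O(p, xi, t, A, beta, rng):
    """Exact OU update, assuming dp = -xi p dt + sqrt(2 A / beta) dW, xi frozen."""
    c = np.exp(-xi * t)
    # Variance formula is positive for either sign of xi; limit 2At/beta as xi -> 0.
    var = A * (1.0 - c * c) / (beta * xi) if abs(xi) > 1e-10 else 2.0 * A * t / beta
    return c * p + np.sqrt(var) * rng.standard_normal(p.shape)

def flow_D(xi, p, t, mu, kBT):
    """Nose-Hoover update: xi <- xi + t/mu (p^T M^{-1} p - n kBT)."""
    return xi + (t / mu) * (p @ p - p.size * kBT)
```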

2. High-Order Flows: Analytical and Numerical Steps

For operators A, B, D, and O, exact closed-form solutions are available. The C flow (covariance scaling) involves the matrix-exponential action $\exp[-\theta\,\Sigma(q)]\,p$ with $\theta = t(h/2)\beta$, which is numerically approximated by a high-order scaling-and-squaring method with a truncated Taylor series (Wei et al., 30 Dec 2025):

  1. Diagonal shift for numerical stability: $\widetilde{\Sigma} = -\theta\,\Sigma$, $\widetilde{\mu} = \operatorname{tr}(\widetilde{\Sigma})/n$.
  2. Centering: $\widehat{\Sigma} = \widetilde{\Sigma} - \widetilde{\mu}\,I$.
  3. Taylor polynomial with scaling and squaring: $\exp(\widehat{\Sigma}) \approx \left(\sum_{j=0}^{m} \frac{1}{j!}(\widehat{\Sigma}/s)^j\right)^s$ for chosen $(m, s)$.
  4. Evaluate $v_{k+1} = \sum_{j=0}^{m}\frac{1}{j!}(\widehat{\Sigma}/s)^j v_k$ iteratively for $k = 0, \dots, s-1$ with $v_0 = p$, then set $p = e^{\widetilde{\mu}} v_s$. This provides a second-order accurate solution to the scaling flow for $p$.
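Steps 1–4 can be implemented directly. The sketch below uses fixed illustrative values of $(m, s)$ rather than the paper's selection rule, and applies the exponential to a vector without ever forming $\exp(\widehat{\Sigma})$ explicitly:

```python
import numpy as np

def scaled_taylor_expm_apply(theta, Sigma, p, m=10, s=4):
    """Apply exp(-theta * Sigma) to vector p.

    Uses a diagonal shift plus s repeated applications of a degree-m
    truncated Taylor polynomial of Sigma_hat / s (scaling-and-squaring
    in action form).  (m, s) are illustrative, not adaptively chosen.
    """
    n = Sigma.shape[0]
    Sigma_tilde = -theta * Sigma
    mu = np.trace(Sigma_tilde) / n            # step 1: diagonal shift
    Sigma_hat = Sigma_tilde - mu * np.eye(n)  # step 2: centering
    v = p.copy()
    for _ in range(s):                        # steps 3-4: s polynomial passes
        term = v.copy()
        acc = v.copy()
        for j in range(1, m + 1):
            term = Sigma_hat @ term / (s * j)  # builds (Sigma_hat/s)^j v / j!
            acc = acc + term
        v = acc
    return np.exp(mu) * v                     # undo the diagonal shift
```

For symmetric positive semi-definite $\Sigma$ the result can be checked against an eigendecomposition-based exponential, which is exact up to round-off.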

3. Strang Splitting Sequence for mCCAdL Integration

The symmetric splitting organizes the five operator flows in the palindromic B-A-O-D-C-D-O-A-B sequence for each time-step $h$:

$$
\exp(hL) \approx \varphi_B^{h/2} \circ \varphi_A^{h/2} \circ \varphi_O^{h/2} \circ \varphi_D^{h/2} \circ \varphi_C^{h} \circ \varphi_D^{h/2} \circ \varphi_O^{h/2} \circ \varphi_A^{h/2} \circ \varphi_B^{h/2}
$$

This symmetric composition guarantees time-reversible propagation. Each flow is either exact or high-order, making the total method second-order weakly accurate for the invariant measure sampled by the stochastic system (Wei et al., 30 Dec 2025).
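The reversibility of a palindromic composition can be illustrated on the deterministic sub-flows alone: stepping forward with $h$ and then with $-h$ returns the initial state to machine precision. A toy example using only the A and B flows on the harmonic potential $U(q) = q^2/2$ (the stochastic O flow and thermostats are omitted for this check):

```python
# Time-reversal check for a palindromic B-A-A-B composition on U(q) = q^2/2,
# so the force is F(q) = -q.  Deterministic toy, not the full mCCAdL step.

def step(q, p, h):
    p = p + (h / 2) * (-q)   # B half-step
    q = q + (h / 2) * p      # A half-step
    q = q + (h / 2) * p      # A half-step (mirror)
    p = p + (h / 2) * (-q)   # B half-step (mirror)
    return q, p

q, p = 1.0, 0.5
q1, p1 = step(q, p, 0.1)     # forward step
q0, p0 = step(q1, p1, -0.1)  # backward step recovers (q, p)
assert abs(q0 - q) < 1e-12 and abs(p0 - p) < 1e-12
```

A non-palindromic composition (e.g. B then A only) fails this check at $O(h^2)$, which is precisely the accuracy the symmetric ordering buys back.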

4. Algorithmic Realization and Stepwise Updates

The following pseudocode summarizes one step of symmetric splitting for mCCAdL:

for each iteration:
    p ← p + (h/2) · F̃(q)                        # B half-step
    q ← q + (h/2) · M⁻¹ · p                      # A half-step
    p ← OU update of p                           # O half-step
    ξ ← ξ + (h/2)/μ · (pᵀ M⁻¹ p − n k_B T)       # D half-step
    # Covariance scaling (C, full step over time t = h)
    compute Σ(q)
    θ ← h · (h/2) · β
    Σ̃ ← −θ Σ(q)
    μ̃ ← tr(Σ̃)/n
    Σ̂ ← Σ̃ − μ̃ I
    iterate v_k with the truncated Taylor series
    p ← e^{μ̃} v_s
    repeat the D and O half-steps (mirror order)
    q ← q + (h/2) · M⁻¹ · p                      # A half-step
    p ← p + (h/2) · F̃(q)                        # B half-step

Only one computation of the stochastic gradient and one computation of the covariance $\Sigma(q)$ are required per iteration, preserving efficiency compared to CCAdL (Wei et al., 30 Dec 2025).
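The full step can be sketched as runnable Python for the diagonal case ($M = I$, constant $\Sigma(q) = \sigma^2 I$); all names are hypothetical, and the O half-step uses a standard closed-form OU solution with $\xi$ frozen, which is an assumption about the paper's exact formulas:

```python
import numpy as np

def mccadl_step(q, p, xi, h, beta, A, mu, kBT, sig2, F_tilde, rng):
    """One symmetric-splitting step in B-A-O-D-C-D-O-A-B order.

    Diagonal sketch: M = I, Sigma(q) = sig2 * I constant, F_tilde the
    stochastic-gradient force.  One gradient call per B half-step here;
    in practice the same minibatch gradient is reused across the step.
    """
    n = q.size
    t = h / 2.0

    def half_O(p, xi):  # OU with xi frozen: dp = -xi p dt + sqrt(2A/beta) dW
        c = np.exp(-xi * t)
        var = A * (1.0 - c * c) / (beta * xi) if abs(xi) > 1e-10 else 2.0 * A * t / beta
        return c * p + np.sqrt(var) * rng.standard_normal(n)

    def half_D(xi, p):  # Nose-Hoover half-step
        return xi + (t / mu) * (p @ p - n * kBT)

    p = p + t * F_tilde(q)                        # B half-step
    q = q + t * p                                 # A half-step
    p = half_O(p, xi)                             # O half-step
    xi = half_D(xi, p)                            # D half-step
    p = np.exp(-h * (h / 2) * beta * sig2) * p    # C full step: exp[-h(h/2) beta Sigma] p
    xi = half_D(xi, p)                            # D half-step (mirror)
    p = half_O(p, xi)                             # O half-step (mirror)
    q = q + t * p                                 # A half-step (mirror)
    p = p + t * F_tilde(q)                        # B half-step (mirror)
    return q, p, xi
```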

5. Numerical Stability, Accuracy, and Performance

The symmetric splitting scheme confers marked improvements:

  • Stability Bound Enhancement: Step-size limits $h$ for mCCAdL are up to $10$–$100\times$ those of CCAdL. In Bayesian linear regression, mCCAdL is stable at $h \approx 5\times10^{-3}$–$10^{-2}$, while CCAdL blows up beyond $h > 10^{-3}$. Similar gains are observed in MNIST and CIFAR-10 logistic regression and discriminative RBM training (Wei et al., 30 Dec 2025).
  • Second-Order Convergence: Second-order weak accuracy is demonstrated for the invariant measure.
  • Efficiency: Larger hh directly reduces the overall gradient evaluations required for a target accuracy.
| Application | $h_{\max}$ CCAdL | $h_{\max}$ mCCAdL | Step-size Gain |
|---|---|---|---|
| Bayesian linear regression | $10^{-3}$ | $5\times10^{-3}$–$10^{-2}$ | $5$–$10\times$ |
| MNIST logistic regression | $1\times10^{-4}$ | $1.2\times10^{-3}$ | $12\times$ |
| CIFAR-10 logistic regression | $1\times10^{-4}$ | $1.5\times10^{-3}$ | $15\times$ |
| RBM training | $<3\times10^{-2}$ | $3\times10^{-2}$ | $10\times$ |

A plausible implication is that mCCAdL is well-suited for scenarios where per-step computational cost is critical and variance in the stochastic-gradient estimator is substantial.

6. Context: Relation to Classical Strang Splitting and Operator Splitting in Kinetic Schemes

The mCCAdL symmetric splitting method is conceptually linked to the Strang-splitting principle developed for operator-split kinetic equations. In cascaded Lattice Boltzmann methods (LBM) for fluid and scalar transport, the analogous symmetric operator-split methodology achieves second-order temporal accuracy and eliminates spurious force/source artifacts by projecting forcing only onto the central moments associated with conserved quantities (Hajabdollahi et al., 2018).

In classical Strang splitting, for two non-commuting operators $P$, $Q$, the propagator over $\Delta t$ is $e^{(\Delta t/2)P}\, e^{\Delta t\, Q}\, e^{(\Delta t/2)P}$, yielding $O(\Delta t^2)$ global accuracy. In mCCAdL, this philosophy is adapted to stochastic gradient sampling, with stability and accuracy gains paralleling those established in LBM operator-splitting contexts.
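The order claim is easy to verify numerically: for two non-commuting matrices, halving $\Delta t$ shrinks the local error of the Strang product by roughly $2^3 = 8$ (local $O(\Delta t^3)$), versus roughly $2^2 = 4$ for the unsymmetrized Lie product. A small self-contained check:

```python
import numpy as np

def expm(Amat, terms=30):
    """Plain Taylor-series matrix exponential (fine for small-norm 2x2)."""
    E = np.eye(Amat.shape[0])
    T = np.eye(Amat.shape[0])
    for j in range(1, terms):
        T = Amat @ T / j
        E = E + T
    return E

P = np.array([[0.0, 1.0], [0.0, 0.0]])
Q = np.array([[0.0, 0.0], [1.0, 0.0]])   # [P, Q] != 0, so splitting errors are nonzero

def split_errors(dt):
    exact = expm(dt * (P + Q))
    strang = expm(dt / 2 * P) @ expm(dt * Q) @ expm(dt / 2 * P)
    lie = expm(dt * P) @ expm(dt * Q)
    return np.linalg.norm(strang - exact), np.linalg.norm(lie - exact)

s1, l1 = split_errors(0.2)
s2, l2 = split_errors(0.1)
# Error ratios under step halving: ~8 for Strang, ~4 for Lie.
```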

7. Benchmarking and Empirical Results

Extensive computational experiments substantiate the advantages of symmetric splitting for mCCAdL:

  • mCCAdL attains the lowest 2-Wasserstein distance in Bayesian linear regression across all $h$.
  • In MNIST/CIFAR-10 classification, mCCAdL maintains high predictive log-likelihood and test accuracy at step-sizes where CCAdL and alternatives become unstable.
  • Posterior mean log-loss in binary tasks is consistently minimized by mCCAdL, with CCAdL yielding NaNs beyond its critical $h$.
  • Discriminative RBM multiclass training with mCCAdL at $h \sim 3\times10^{-2}$ achieves the lowest test error; alternatives deteriorate or become unstable.

All empirical results confirm second-order weak accuracy and step-size resilience. These observations suggest that symmetric splitting for mCCAdL is broadly applicable to noisy-gradient thermodynamic sampling in large-scale models and settings with substantial stochastic gradient variance (Wei et al., 30 Dec 2025).
