Adjusted Langevin Correctors

Updated 4 July 2026

Adjusted Langevin Correctors are mechanisms that remove biases from Langevin samplers by correcting for discretization, score approximations, and geometric mismatches.
They employ techniques such as Metropolis adjustments, Fisher preconditioning, and trajectory-level corrections to maintain target invariance across varied sampling regimes.
These correctors find applications in Bayesian inverse problems, diffusion models, and discrete spaces, enhancing sampling accuracy and efficiency in both low and high dimensions.

Adjusted Langevin Correctors are correction mechanisms attached to Langevin-type samplers in order to remove or reduce the bias introduced by time discretization, score approximation, geometric mismatch, or support constraints. In the canonical continuous-state setting, the corrector is the Metropolis–Hastings accept/reject step that turns an Euler–Maruyama discretization of overdamped Langevin dynamics into a $\pi$-invariant Markov chain. In broader usage, the term also covers geometry-aware preconditioning, score calibrations in discrete spaces, and weight-based or trajectory-level corrections that restore either exact stationarity or asymptotic exactness under more specialized sampling regimes (Wang et al., 12 Mar 2025, Gissler et al., 17 Feb 2026).

1. Metropolis adjustment as the canonical Langevin corrector

For a target density $\pi(x) \propto e^{-U(x)}$ on $\mathbb{R}^d$, the overdamped Langevin diffusion

$dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$

preserves $\pi$ as its invariant distribution under mild conditions. Its Euler–Maruyama discretization produces the unadjusted Langevin proposal

$y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$

or, in a preconditioned form,

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$

with Gaussian proposal density

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$

Because this discretization is only approximate, it generally does not leave $\pi$ invariant at finite step size. The Metropolis–Hastings correction,

$\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$

restores reversibility and $\pi$-invariance. In this narrow but central sense, the adjusted Langevin corrector is precisely the Metropolis step (Wang et al., 12 Mar 2025, Biron-Lattes et al., 2023).
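The proposal and acceptance rule above can be sketched in a few lines of NumPy. This is a minimal illustration with the identity preconditioner, not a reference implementation; the function names `mala_step` and `run_mala`, the standard Gaussian test target, and the step size `h=0.5` are assumptions chosen for the example.

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, h, rng):
    """One Metropolis-adjusted Langevin step with identity preconditioner."""
    def log_q(a, b):
        # log q(a, b) up to a constant: density of b under N(a + h/2 * grad, h I)
        mean = a + 0.5 * h * grad_log_pi(a)
        return -np.sum((b - mean) ** 2) / (2.0 * h)

    # Unadjusted Langevin proposal (Euler-Maruyama step)
    y = x + 0.5 * h * grad_log_pi(x) + np.sqrt(h) * rng.standard_normal(x.shape)
    # Metropolis-Hastings log ratio: pi(y) q(y, x) / (pi(x) q(x, y))
    log_alpha = log_pi(y) + log_q(y, x) - log_pi(x) - log_q(x, y)
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False

def run_mala(x0, log_pi, grad_log_pi, h, n_steps, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for k in range(n_steps):
        x, ok = mala_step(x, log_pi, grad_log_pi, h, rng)
        samples[k] = x
        accepted += ok
    return samples, accepted / n_steps

# Illustrative target (an assumption for this example): standard Gaussian in 2D
log_pi = lambda x: -0.5 * np.sum(x ** 2)
grad_log_pi = lambda x: -x
samples, acc_rate = run_mala(np.zeros(2), log_pi, grad_log_pi, h=0.5, n_steps=20000)
```

Dropping the accept/reject branch and always returning `y` gives the unadjusted sampler, whose stationary law differs from $\pi$ at any fixed step size.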

This interpretation extends naturally to position-dependent proposals. For position-dependent MALA, the drift must be chosen with care because a state-dependent metric changes the underlying reference measure. A corrected position-dependent Langevin diffusion with invariant density $\pi$ with respect to Lebesgue measure uses an additional drift term

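$\Gamma_i(x)=\frac{1}{2}\sum_{j=1}^{d}\frac{\partial M_{ij}(x)}{\partial x_j},\qquad i=1,\dots,d,$

where $M(x)$ is the position-dependent preconditioner,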

so that the proposal covariance and the Metropolis ratio are consistent with the intended Euclidean target rather than the manifold volume measure (Xifara et al., 2013).

The non-asymptotic analysis of MALA also sharpened the role of the corrector. For log-smooth and strongly log-concave targets under a warm start, the mixing time of MALA is shown to scale as $\widetilde{\Theta}(d^{1/2})$, with the proof based on a projection characterization of the Metropolis adjustment that reduces the analysis to discretization control for the Langevin SDE rather than direct acceptance-probability calculations (Chewi et al., 2020).

2. Two-level correction: proposal geometry and Fisher adaptation

A more expansive interpretation distinguishes between two correction layers. The first is the Metropolis accept/reject step, which corrects discretization bias. The second is a geometric correction of the proposal itself, which aligns the drift–diffusion step with posterior curvature. This viewpoint is explicit in Fisher-adaptive MALA for Bayesian inverse problems, where the Metropolis correction is combined with Fisher-based preconditioning (Wang et al., 12 Mar 2025).

For Bayesian inverse problems with posterior

$\pi(x) \propto e^{-U(x)}$ 4

and in the Gaussian-noise/Gaussian-prior setting,

$\pi(x) \propto e^{-U(x)}$ 5

the score used throughout is

$\pi(x) \propto e^{-U(x)}$ 6

The Fisher preconditioner is built from the posterior score covariance

$\mathcal{I}=\mathbb{E}_{\pi}\!\left[\nabla\log\pi(x)\,\nabla\log\pi(x)^{\top}\right].$

Under an expected squared jump distance (ESJD) criterion with a trace constraint, the optimal preconditioner satisfies

$M^{\star}\propto\mathcal{I}^{-1},$

so inverse Fisher information is the optimal proposal-shaping matrix in this ESJD sense (Titsias, 2023).

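The adaptive scheme estimates the Fisher matrix $\mathcal{I}$ online through the running average

$\hat{\mathcal{I}}_n=\frac{1}{n}\sum_{i=1}^{n}s_i s_i^{\top},$

where $s_i$ is the score-based sample vector at iteration $i$, or through the stochastic approximation recursion

$\hat{\mathcal{I}}_n=\hat{\mathcal{I}}_{n-1}+\gamma_n\left(s_n s_n^{\top}-\hat{\mathcal{I}}_{n-1}\right),$

with $\gamma_n=1/n$ recovering the running average. Under ergodicity and bounded fourth moments,

$\hat{\mathcal{I}}_n\to\mathcal{I}\quad\text{as } n\to\infty,$

which gives stable convergence of the learned preconditioner and asymptotic optimality of $\hat{\mathcal{I}}_n^{-1}$ (Wang et al., 12 Mar 2025).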
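The relation between the running average and the stochastic approximation recursion can be checked numerically. The sketch below assumes the recursion has the standard form "new estimate = old estimate + gain times (outer product minus old estimate)" and uses synthetic Gaussian vectors as stand-ins for score samples; the function names and the gain schedule argument `gamma` are hypothetical.

```python
import numpy as np

def fisher_running_average(vectors):
    """Batch estimate: mean of the outer products s_i s_i^T."""
    s = np.asarray(vectors, dtype=float)
    return s.T @ s / s.shape[0]

def fisher_recursion(vectors, gamma=lambda n: 1.0 / n):
    """Online estimate via est_n = est_{n-1} + gamma_n (s_n s_n^T - est_{n-1})."""
    s = np.asarray(vectors, dtype=float)
    d = s.shape[1]
    est = np.zeros((d, d))
    for n, s_n in enumerate(s, start=1):
        est = est + gamma(n) * (np.outer(s_n, s_n) - est)
    return est

# Synthetic stand-ins for score samples (an assumption for this example)
rng = np.random.default_rng(1)
scores = rng.standard_normal((500, 3))
batch = fisher_running_average(scores)
online = fisher_recursion(scores)  # gain 1/n reproduces the running average
```

A gain that decays more slowly than $1/n$ would weight recent samples more heavily, which is one way such a recursion can discount early, poorly mixed iterations.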

The full Fisher-adaptive MALA algorithm makes the two-level correction explicit. A standard MALA burn-in with $\mathbb{R}^d$ 5 tunes $\mathbb{R}^d$ 6 toward a target acceptance $\mathbb{R}^d$ 7. Each subsequent iteration computes the score, forms the preconditioned proposal, applies the Metropolis corrector, and updates the square-root factor $\mathbb{R}^d$ 8 of the preconditioner from the variance-reduced score increment

$\mathbb{R}^d$ 9

The Fisher update via rank-1 corrections is $O(d^2)$ per step, while gradients and forward solves dominate in PDE inverse problems. Empirically, this scheme yields higher ESJD, larger effective sample size (ESS), and lower autocorrelation (ACF) than covariance-adaptive MALA and pCN, especially in high dimensions; in linear-Gaussian problems, the inverse Fisher matrix $\mathcal{I}^{-1}$ coincides with the posterior covariance, while in nonlinear problems inverse Fisher better captures local curvature (Wang et al., 12 Mar 2025).

3. Adjusted scores as correctors in discrete state spaces

In discrete spaces, the obstacle is not only discretization of a continuous diffusion but also ambiguity in how to define a score compatible with the target distribution. For binary hypercubes $dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 2, adjusted score functions provide the discrete analogue of Langevin correction by calibrating single-site flip probabilities so that the small-step limit recovers the exact Glauber dynamics of the target (Gissler et al., 17 Feb 2026).

For a strictly positive target $dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 3, the Glauber score is defined coordinatewise by

$dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 4

and the Gibbs score by

$dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 5

For Ising models with energy

$dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 6

one has $dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 7, where $dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 8 is the local field (Gissler et al., 17 Feb 2026).
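For intuition, the role of the local field in single-site dynamics can be illustrated on a tiny Ising model. The sketch below makes several assumptions that are not taken from the source: spins take values in $\{-1,+1\}$, the target is $\pi(x)\propto\exp(\tfrac12 x^\top J x + b^\top x)$ with symmetric, zero-diagonal $J$, and the dynamics are random-scan heat-bath (Glauber) updates. Under those conventions the conditional probability of $x_i=+1$ is a logistic function of twice the local field, and exact enumeration confirms that the kernel leaves $\pi$ invariant.

```python
import itertools
import numpy as np

def local_field(x, J, b):
    """Local field at every site, assuming J is symmetric with zero diagonal."""
    return J @ x + b

def glauber_kernel(J, b):
    """Random-scan heat-bath transition matrix over all 2^d spin states."""
    d = len(b)
    states = [np.array(s) for s in itertools.product([-1, 1], repeat=d)]
    index = {tuple(s): k for k, s in enumerate(states)}
    logp = np.array([0.5 * s @ J @ s + b @ s for s in states])
    pi = np.exp(logp - logp.max())
    pi /= pi.sum()
    P = np.zeros((len(states), len(states)))
    for k, x in enumerate(states):
        field = local_field(x, J, b)
        for i in range(d):
            # Conditional probability that site i equals +1 given the rest
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * field[i]))
            for value, prob in ((1, p_plus), (-1, 1.0 - p_plus)):
                y = x.copy()
                y[i] = value
                P[k, index[tuple(y)]] += prob / d  # site chosen uniformly
    return pi, P

# Illustrative couplings and external field (assumptions for this example)
J = np.array([[0.0, 0.5, -0.3], [0.5, 0.0, 0.2], [-0.3, 0.2, 0.0]])
b = np.array([0.1, -0.2, 0.3])
pi, P = glauber_kernel(J, b)
```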

These adjusted scores calibrate discrete Langevin kernels. For DULA, the proposal kernel is

$dX_t = \frac{1}{2}\nabla \log \pi(X_t)\,dt + dW_t$ 9

while for DUPS one uses a two-stage proximal construction

$\pi$ 0

With the Gibbs score in DULA, the small-step generator becomes the Glauber generator exactly in the limit $\pi$ 1; with the Glauber score in DUPS, the two-stage kernel approximates the proximal Gibbs sampler for the joint proximal distribution (Gissler et al., 17 Feb 2026).

Without Metropolis adjustment, these samplers are asymptotically correct in the small-step regime. With adjusted scores, the paper gives explicit stationary error bounds such as

$\pi$ 2

under $\pi$ 3 and $\pi$ 4, and

$\pi$ 5

under the stated regularity assumptions, both vanishing as $\pi$ 6. If exact stationarity is required, a Metropolis step can be added. For two-stage kernels,

$\pi$ 7

which restores detailed balance exactly. This establishes a discrete analogue of the adjusted Langevin corrector: the adjusted score provides asymptotic exactness, while the Metropolis or Barker filter provides exact invariance for any step size (Gissler et al., 17 Feb 2026).
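The claim that a Metropolis filter gives exact invariance at any step size can be verified by enumeration on a small model. The sketch below is illustrative only: it does not reproduce the DULA or DUPS kernels, and instead assumes a simple factorized proposal in which site $i$ flips independently with probability $1-\exp(-\text{step}\cdot r_i(x))$, where $r_i$ is the heat-bath flip probability of a $\{-1,+1\}$ Ising model with $\pi(x)\propto\exp(\tfrac12 x^\top J x + b^\top x)$. All names and parameter values are hypothetical.

```python
import itertools
import numpy as np

def ising_log_pi(x, J, b):
    return 0.5 * x @ J @ x + b @ x

def flip_probs(x, J, b, step):
    """Per-site flip probabilities of the assumed factorized proposal."""
    field = J @ x + b
    rate = 1.0 / (1.0 + np.exp(2.0 * x * field))  # heat-bath flip probability
    return 1.0 - np.exp(-step * rate)

def proposal_prob(x, y, J, b, step):
    """q(x, y): product over sites of flip or no-flip probabilities."""
    p = flip_probs(x, J, b, step)
    return np.prod(np.where(x != y, p, 1.0 - p))

def metropolised_kernel(J, b, step):
    """Full transition matrix of the Metropolis-adjusted chain."""
    states = [np.array(s) for s in itertools.product([-1, 1], repeat=len(b))]
    logp = np.array([ising_log_pi(s, J, b) for s in states])
    pi = np.exp(logp - logp.max())
    pi /= pi.sum()
    n = len(states)
    P = np.zeros((n, n))
    for a, x in enumerate(states):
        for c, y in enumerate(states):
            if a == c:
                continue
            q_xy = proposal_prob(x, y, J, b, step)
            q_yx = proposal_prob(y, x, J, b, step)
            accept = min(1.0, pi[c] * q_yx / (pi[a] * q_xy))
            P[a, c] = q_xy * accept
        P[a, a] = 1.0 - P[a].sum()  # rejected or unchanged proposals stay put
    return pi, P

J = np.array([[0.0, 0.5, -0.3], [0.5, 0.0, 0.2], [-0.3, 0.2, 0.0]])
b = np.array([0.1, -0.2, 0.3])
pi, P = metropolised_kernel(J, b, step=2.0)  # deliberately large step
```

Replacing `min(1.0, r)` with the Barker rule `r / (1.0 + r)` also satisfies detailed balance, at the cost of a lower acceptance probability for the same proposal.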

4. Correctors on constrained and non-Euclidean domains

When the target support is a proper convex subset of $\mathbb{R}^d$, or more generally when the geometry is non-Euclidean, the corrector must account for both discretization and feasibility. A representative construction is the Metropolis-adjusted Preconditioned Langevin Algorithm for constrained domains, where the proposal is

$\pi$ 9

with Gaussian density

$y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 0

If $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 1, the move is rejected immediately. The Metropolis acceptance probability is

$y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 2

and its logarithm contains both an energy difference and a metric-determinant correction (Srinivasan et al., 2024).
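The two ingredients described here, immediate rejection of infeasible proposals and a log acceptance ratio that combines an energy difference with a metric-determinant term, can be sketched on a toy domain. The example below is not the algorithm of the cited paper: it assumes the unit box $(0,1)^d$, a diagonal metric given by the Hessian of the log-barrier, the step-size convention of Section 1 (drift $\tfrac{h}{2}$, covariance $h$ times the inverse metric), and a Gaussian-shaped target centred at $0.5$. All function names are hypothetical.

```python
import numpy as np

def metric_diag(x):
    """Diagonal of the log-barrier Hessian for the box (0, 1)^d (assumed metric)."""
    return 1.0 / x ** 2 + 1.0 / (1.0 - x) ** 2

def log_q(x, y, grad_log_pi, h):
    """log N(y; x + h/2 G(x)^{-1} grad, h G(x)^{-1}) up to a constant."""
    g = metric_diag(x)
    mean = x + 0.5 * h * grad_log_pi(x) / g
    # The first term is the metric-determinant contribution
    return 0.5 * np.sum(np.log(g / h)) - 0.5 * np.sum(g * (y - mean) ** 2) / h

def constrained_step(x, log_pi, grad_log_pi, h, rng):
    g = metric_diag(x)
    noise = rng.standard_normal(x.shape)
    y = x + 0.5 * h * grad_log_pi(x) / g + np.sqrt(h / g) * noise
    if np.any(y <= 0.0) or np.any(y >= 1.0):
        return x, False  # infeasible proposal: reject immediately
    log_alpha = (log_pi(y) - log_pi(x)
                 + log_q(y, x, grad_log_pi, h) - log_q(x, y, grad_log_pi, h))
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False

# Illustrative target (an assumption): Gaussian bump centred in the box
log_pi = lambda x: -0.5 * np.sum((x - 0.5) ** 2) / 0.04
grad_log_pi = lambda x: -(x - 0.5) / 0.04

rng = np.random.default_rng(0)
x = np.full(2, 0.5)
box_samples = np.empty((20000, 2))
n_accept = 0
for k in range(box_samples.shape[0]):
    x, ok = constrained_step(x, log_pi, grad_log_pi, 0.2, rng)
    box_samples[k] = x
    n_accept += ok
box_acc = n_accept / box_samples.shape[0]
```

Because the metric grows without bound near the boundary, proposals shrink automatically there, so outright infeasible proposals are rare in this toy setting.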

Here the corrector has a specific role: the exact weighted Langevin dynamics on the manifold $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 3 includes a divergence drift $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 4, whereas the proposal omits the divergence term. The Metropolis step compensates for that omission and for boundary truncation, restoring detailed balance with respect to $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 5. Under self-concordance, $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 6-symmetry, and relative bounds on $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 7, the paper derives non-asymptotic warm-start mixing bounds. Under standard self-concordance,

$y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 8

for $y = x + \frac{h}{2}\nabla \log \pi(x) + \sqrt{h}\,\xi,\qquad \xi \sim \mathcal{N}(0,I),$ 9, and under stronger SC++ assumptions the admissible step size improves from

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 0

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 1

The polylogarithmic dependence on the error tolerance is one reason the method is described as a high-accuracy sampler (Srinivasan et al., 2024).

Mirror-Langevin sampling supplies a parallel construction for compact convex sets endowed with a self-concordant mirror $y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 2. In dual coordinates the unadjusted proposal is

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 3

with proposal density

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 4

Adding the Metropolis filter yields MAMLA, which is unbiased relative to the target. Under relative convexity, relative smoothness, and relative Lipschitz assumptions, its warm-start mixing time satisfies

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 5

in the strongly convex case, and

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 6

in the weakly convex case with a symmetric barrier (Srinivasan et al., 2023).

5. Adaptive, trajectory-level, and second-order correctors

Not all adjusted Langevin correctors are static Metropolis filters attached to a fixed Euler proposal. Several recent variants modify the proposal map itself while preserving an exact correction principle.

autoMALA chooses the step size locally at each iteration. It represents MALA as one leapfrog step in an augmented HMC-like space with momentum $y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 7 and target

$y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 8

A deterministic routine doubles or halves $y = x + \frac{h}{2} M \nabla \log \pi(x) + \sqrt{h}\,M^{1/2}\xi,$ 9 until the augmented log-density change under a leapfrog step falls into a random acceptance window $q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 0, and a reversibility check rejects whenever the forward and reverse selectors disagree on the step size. The final Metropolis step

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 1

then preserves the correct invariant distribution despite continual local step-size adaptation. This corrector addresses spatially varying geometry without introducing Hessians into the tuning rule (Biron-Lattes et al., 2023).

A different second-order route appears in Hessian-corrected MALA. There, the Langevin drift is linearized using

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 2

and the resulting affine SDE is solved exactly over one step. The proposal is Gaussian,

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 3

with

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 4

followed by the usual Metropolis correction. This construction uses the Hessian as a local proposal corrector before the MH filter restores exactness (House, 2015).

Trajectory-level correctors arise in kinetic Langevin sampling. In Metropolis Adjusted Langevin Trajectories, one simulates a discretized kinetic Langevin path of $q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 5 leapfrog-like substeps with partial OU refreshment and then applies a single acceptance test based on the total accumulated energy defect

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 6

This differs from GHMC, which applies a Metropolis test at each substep and flips momentum on rejection. By correcting the whole trajectory at once, MALT avoids backtracking and attains larger step sizes. For isotropic targets it has the same optimal high-dimensional scaling as HMC, namely $h \propto d^{-1/4}$ with optimal acceptance about $65\%$ (Riou-Durand et al., 2022). Adaptive MALT subsequently tunes the step size, damping, and trajectory length by optimizing an ESS surrogate, with the step size adapted through the synthetic gradient
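A trajectory-level correction of this kind can be sketched as follows. The code assumes an OBABO splitting of kinetic Langevin dynamics with unit mass, a full momentum refresh at the start of each trajectory, and an energy defect accumulated only over the deterministic leapfrog parts; these choices, the function names, and the parameter values are assumptions for illustration rather than a transcription of the cited algorithm.

```python
import numpy as np

def malt_trajectory(x0, U, grad_U, h, n_sub, gamma, rng):
    """One trajectory of OBABO substeps with a single accept/reject test."""
    eta = np.exp(-0.5 * gamma * h)        # OU contraction over a half step
    noise_scale = np.sqrt(1.0 - eta ** 2)
    x = x0.copy()
    v = rng.standard_normal(x.shape)      # full momentum refresh at the start
    energy_defect = 0.0
    for _ in range(n_sub):
        v = eta * v + noise_scale * rng.standard_normal(x.shape)  # O: partial refresh
        e_before = U(x) + 0.5 * v @ v
        v = v - 0.5 * h * grad_U(x)                               # B: half kick
        x = x + h * v                                             # A: drift
        v = v - 0.5 * h * grad_U(x)                               # B: half kick
        energy_defect += U(x) + 0.5 * v @ v - e_before            # leapfrog error only
        v = eta * v + noise_scale * rng.standard_normal(x.shape)  # O: partial refresh
    # Single test on the accumulated defect; on rejection return to the start
    if np.log(rng.uniform()) < -energy_defect:
        return x, True
    return x0, False

def run_malt(x0, U, grad_U, h, n_sub, gamma, n_iter, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    out = np.empty((n_iter, x.size))
    n_acc = 0
    for k in range(n_iter):
        x, ok = malt_trajectory(x, U, grad_U, h, n_sub, gamma, rng)
        out[k] = x
        n_acc += ok
    return out, n_acc / n_iter

# Illustrative target (an assumption): standard Gaussian, U(x) = |x|^2 / 2
U = lambda x: 0.5 * x @ x
grad_U = lambda x: x
malt_samples, malt_acc = run_malt(np.zeros(2), U, grad_U,
                                  h=0.3, n_sub=5, gamma=1.0, n_iter=5000)
```

The partial refreshments do not enter the acceptance test because each one is reversible with respect to the Gaussian momentum law, so only the leapfrog energy errors remain in the ratio.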

$q(x,y)=\mathcal{N}\!\big(y;\,x+\tfrac{h}{2}M\nabla\log\pi(x),\,hM\big).$ 9

and default target $\pi$ 0 (Riou-Durand et al., 2022).

A related but distinct development is the gradient-adjusted underdamped Langevin dynamics family, which changes the SDE itself by inserting a gradient-adjusted term in the position dynamics and a nonreversible matrix $\pi$ 1. For Gaussian targets, its Euler–Maruyama discretization reaches a biased target in a number of steps depending on $\pi$ 2 rather than $\pi$ 3, suggesting that some “correctors” operate at the level of the drift–diffusion structure rather than solely at the Metropolis layer (Zuo et al., 2024).

6. Correctors in diffusion models, latent-variable learning, and current debates

In score-based diffusion models, predictor–corrector samplers usually employ unadjusted Langevin steps as correctors at each noise level, even though ULA is itself biased. Metropolis-Adjusted Diffusion Models replace those correctors with accept/reject steps computed from the score alone. The key identity is

$\log\frac{\pi(y)}{\pi(x)}=\int_{0}^{1}\big\langle\nabla\log\pi\big(x+s(y-x)\big),\,y-x\big\rangle\,ds,$

which makes it possible to express the Metropolis or Barker ratio without explicit access to $\pi$ 5. The paper introduces an exact Barker corrector based on a two-coin Bernoulli factory and an efficient Simpson-rule approximation whose error is of order $\pi$ 6 in the step size. On image datasets, these adjusted correctors yield consistent FID improvements, including ImageNet-64 gains from $\pi$ 7 to $\pi$ 8 under Heun PF-ODE and from $\pi$ 9 to $\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$ 0 under Euler PF-ODE (Lam et al., 10 May 2026).
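A score-only evaluation of the target part of the acceptance ratio can be sketched with a composite Simpson rule. The example assumes the identity takes the form of a line integral of the score along the segment from $x$ to $y$; it does not reproduce the exact Bernoulli-factory corrector, and the names `log_ratio_simpson` and `barker_accept_prob` are hypothetical. For a Gaussian target the score is linear, so Simpson's rule is exact and the result can be compared against the closed-form log ratio.

```python
import numpy as np

def log_ratio_simpson(x, y, score, n_intervals=2):
    """Approximate log pi(y) - log pi(x) from score evaluations on the segment."""
    if n_intervals % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    t = np.linspace(0.0, 1.0, n_intervals + 1)
    # Composite Simpson weights: (1, 4, 2, 4, ..., 4, 1) / (3 * n_intervals)
    w = np.ones(n_intervals + 1)
    w[1:-1:2] = 4.0
    w[2:-1:2] = 2.0
    w /= 3.0 * n_intervals
    direction = y - x
    integrand = np.array([score(x + ti * direction) @ direction for ti in t])
    return float(w @ integrand)

def barker_accept_prob(log_ratio):
    """Barker acceptance probability r / (1 + r) for a total log ratio log r."""
    return 1.0 / (1.0 + np.exp(-log_ratio))

# Illustrative Gaussian target with known precision matrix (an assumption)
precision = np.array([[2.0, 0.5], [0.5, 1.0]])
score = lambda z: -precision @ z
x = np.array([0.3, -1.2])
y = np.array([1.0, 0.4])

approx = log_ratio_simpson(x, y, score)
exact = -0.5 * y @ precision @ y + 0.5 * x @ precision @ x
```

In a full corrector the proposal log-densities would be added to this quantity before applying the Metropolis or Barker rule.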

A different correction principle is needed for evolving targets. Jarzynski-adjusted Langevin uses a weighted ULA path with recursively updated log-weights

$\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$ 1

where

$\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$ 2

This produces the exact identity

$\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$ 3

so the corrector is not an accept/reject mechanism but an importance-weight adjustment rooted in Jarzynski’s equality. In latent-variable models this yields JALA-EM, an SMC-based learning procedure with nonasymptotic convergence guarantees under PL or strong-convexity assumptions (Cuin et al., 23 May 2025).

Several recurrent misconceptions are clarified by this body of work. One is that the only legitimate Langevin corrector is the Metropolis step. The literature instead supports a broader taxonomy: Metropolis correction for exact invariance in continuous spaces, adjusted scores for discrete Glauber limits, trajectory-level correction in kinetic schemes, and Jarzynski weights for evolving targets. A second misconception is that covariance adaptation is always the natural geometry correction. In linear-Gaussian inverse problems inverse Fisher and posterior covariance coincide, but in nonlinear problems inverse Fisher is the principled ESJD-optimal choice and empirically yields higher ESJD, ESS, and lower ACF than covariance-adaptive MALA (Wang et al., 12 Mar 2025).

Open questions remain explicit in the papers. In discrete sampling, robustness of adjusted-score correctors to learned score error and extensions beyond product-state hypercubes are open directions (Gissler et al., 17 Feb 2026). In constrained sampling, the effect of omitting the divergence term $\alpha(x,y)=\min\left\{1,\frac{\pi(y)q(y,x)}{\pi(x)q(x,y)}\right\},$ 4 on mixing-rate sharpness, and feasibility-preserving first-order alternatives to reject-if-outside proposals, remain unresolved (Srinivasan et al., 2024). For autoMALA, irreducibility is not yet proven, and the paper recommends mixing with an irreducible kernel if that guarantee is needed (Biron-Lattes et al., 2023). In diffusion models, exact corrected samplers are now available in principle, but their practical behavior under score misspecification remains an active question (Lam et al., 10 May 2026).

Across these variants, the unifying theme is stable: a Langevin proposal is treated as a predictor, and an additional mechanism corrects either its invariant measure, its geometry, or its pathwise bias. The resulting theory no longer treats “adjustment” as a single algorithmic trick, but as a family of correction principles adapted to continuous, discrete, constrained, adaptive, and nonequilibrium sampling regimes.