Twisted Diffusion Sampler (TDS)

Updated 15 April 2026
  • Twisted Diffusion Sampler (TDS) is a sequential Monte Carlo algorithm that employs twisting to enable asymptotically exact conditional sampling from unconditional diffusion models.
  • It leverages weighted particles and time-dependent potentials in a reverse diffusion process to efficiently incorporate conditioning information and converge to the true posterior.
  • TDS has demonstrated significant empirical improvements in tasks such as image inpainting, class-conditional generation, and protein motif-scaffolding, and it extends to Riemannian state spaces.

The Twisted Diffusion Sampler (TDS) is a sequential Monte Carlo (SMC) algorithm enabling practical and asymptotically exact conditional sampling from distributions induced by unconditional diffusion models. Unlike prior approaches that depend on task-specific conditional training or heuristic approximations, TDS leverages SMC principles and the method of “twisting” to realize flexible and accurate conditional generation without retraining diffusion networks. The technique operates by simulating a set of weighted particles through the reverse diffusion chain, where twisting incorporates conditioning information and ensures convergence to the true posterior as the number of particles increases. TDS applies to both Euclidean and Riemannian state spaces and has demonstrated empirical improvements over existing heuristics in image inpainting, class-conditional image generation, and motif-scaffolding for protein design (Wu et al., 2023).

1. Conditional Sampling in Diffusion Models

Given an unconditional diffusion model with a forward process $x_0 \sim p(x_0)$ and reverse transitions

$$x_t \mid x_{t+1} \sim \mathcal{N}\left(x_t;\ x_{t+1} + \sigma_{t+1}^2 s_\theta(x_{t+1}, t+1),\ \sigma_{t+1}^2 I\right)$$

for $t = T-1, \ldots, 0$, conditional generation aims to sample from the posterior $p(x_0 \mid y) \propto p(x_0)\, \ell(y \mid x_0)$, where $y$ denotes observations and $\ell(y \mid x_0)$ is a likelihood term. The Markov chain $x_0 \rightarrow x_1 \rightarrow \ldots \rightarrow x_T$ facilitates representing the joint as

$$p(x_{0:T}, y) = p(x_0)\, \ell(y \mid x_0) \prod_{t=0}^{T-1} p(x_{t+1} \mid x_t).$$

This formulation reduces conditional sampling to approximating the marginal $p(x_0 \mid y)$, typically intractable due to the high-dimensional latent space and the complex form of $\ell(y \mid x_0)$.
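
As a rough illustration of this setup, the sketch below runs the reverse chain in a one-dimensional toy problem and then reweights the resulting samples by the likelihood (naive importance sampling, not TDS). The prior $p(x_0) = \mathcal{N}(0, 1)$, the constant per-step variance `SIGMA2`, the Gaussian likelihood with variance `tau2`, and all function names are assumptions made for this example; the exact score of the toy model stands in for a learned $s_\theta$.

```python
import math
import random

SIGMA2, T = 0.04, 50  # assumed toy per-step noise variance and horizon

def score(x, t):
    """Exact score of N(0, 1 + t*SIGMA2); a stand-in for a learned s_theta."""
    return -x / (1.0 + t * SIGMA2)

def sample_x0(rng):
    """Run the reverse chain x_T -> x_0 with Gaussian reverse transitions."""
    x = rng.gauss(0.0, math.sqrt(1.0 + T * SIGMA2))  # x_T ~ p(x_T)
    for t in range(T - 1, -1, -1):
        x = rng.gauss(x + SIGMA2 * score(x, t + 1), math.sqrt(SIGMA2))
    return x

def naive_posterior_mean(y, tau2, K, seed=0):
    """Weight unconditional samples by l(y | x_0) = N(y; x_0, tau2)."""
    rng = random.Random(seed)
    xs = [sample_x0(rng) for _ in range(K)]
    logw = [-0.5 * (y - x) ** 2 / tau2 for x in xs]
    m = max(logw)                                   # stabilise the exponentials
    w = [math.exp(l - m) for l in logw]
    total = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, xs)) / total
    ess = total ** 2 / sum(wi * wi for wi in w)     # effective sample size
    return mean, ess

mean, ess = naive_posterior_mean(y=1.5, tau2=0.25, K=5000)
```

In this toy model the posterior mean is available in closed form, $y / (1 + \tau^2) = 1.2$, so the weighted estimate can be checked directly. The effective sample size returned alongside it shows how many of the unconditional samples carry meaningful weight; in higher dimensions or with sharper likelihoods this fraction tends to collapse, which is the difficulty the section describes.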

2. The Twisting Principle in SMC

Twisting is employed within SMC to progressively introduce conditioning information via time-dependent potentials $\tilde{p}(y \mid x_t)$, approximating $p(y \mid x_t)$. For each step $t = T-1, \ldots, 0$, the twisted proposal is defined as

$$\tilde{p}(x_t \mid x_{t+1}, y) = \mathcal{N}\left(x_t;\ x_{t+1} + \sigma_{t+1}^2 \left[ s_\theta(x_{t+1}, t+1) + \nabla_{x_{t+1}} \log \tilde{p}(y \mid x_{t+1}) \right],\ \sigma_{t+1}^2 I\right)$$

and the corresponding twisted importance weight is

$$w_t = \frac{p(x_t \mid x_{t+1})\, \tilde{p}(y \mid x_t)}{\tilde{p}(y \mid x_{t+1})\, \tilde{p}(x_t \mid x_{t+1}, y)}.$$

This construction allows the SMC chain to interpolate between the original unconditional process and the conditional target, propagating likelihood information backward through the chain. In continuous time, twisting alters the drift in the reverse SDE:

$$dx_t = -\sigma_t^2 \left[ s_\theta(x_t, t) + \nabla_{x_t} \log \tilde{p}(y \mid x_t) \right] dt + \sigma_t\, d\bar{W}_t,$$

preserving the diffusion coefficient $\sigma_t$; here $\bar{W}_t$ denotes a reverse-time Brownian motion.
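
A short derivation may help show why weights of this form target the posterior. Writing $\tilde{p}(y \mid x_t)$ for the twisting functions, $\tilde{p}(x_t \mid x_{t+1}, y)$ for the twisted proposal, and $w_t$ for the twisted importance weight, it assumes (as one possible convention) that particles start from $x_T \sim p(x_T)$ with initial weight $w_T = \tilde{p}(y \mid x_T)$, and that the final twisting function equals the likelihood, $\tilde{p}(y \mid x_0) = \ell(y \mid x_0)$.

```latex
% Assumed weight definition:
%   w_t = p(x_t | x_{t+1}) \tilde{p}(y | x_t) / [ \tilde{p}(y | x_{t+1}) \tilde{p}(x_t | x_{t+1}, y) ]
\begin{align*}
p(x_T)\, w_T \prod_{t=0}^{T-1} \tilde{p}(x_t \mid x_{t+1}, y)\, w_t
  &= p(x_T)\, \tilde{p}(y \mid x_T) \prod_{t=0}^{T-1}
     \frac{p(x_t \mid x_{t+1})\, \tilde{p}(y \mid x_t)}{\tilde{p}(y \mid x_{t+1})} \\
  &= \Big[ p(x_T) \prod_{t=0}^{T-1} p(x_t \mid x_{t+1}) \Big]\, \tilde{p}(y \mid x_0) \\
  &= p(x_{0:T})\, \ell(y \mid x_0).
\end{align*}
```

The ratios of twisting functions telescope, so the product of proposal densities and weights equals the unnormalized joint posterior however roughly $\tilde{p}(y \mid x_t)$ approximates $p(y \mid x_t)$ at intermediate steps. The quality of that approximation affects the variance of the weights rather than the target itself.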

3. Twisted Diffusion Sampler Algorithm

The TDS proceeds in discrete time with $K$ particles and a time horizon of $T$ steps, using the twisting functions $\tilde{p}(y \mid x_t)$ and the twisted proposal $\tilde{p}(x_t \mid x_{t+1}, y)$ of Section 2:

  1. Initialization: For each particle $k = 1, \ldots, K$, sample $x_T^k \sim p(x_T)$ and set $w^k = \tilde{p}(y \mid x_T^k)$.
  2. Reverse Propagation (for $t = T-1$ down to $t = 0$):

    • Resampling: Compute effective sample size (ESS); if it falls below the resampling threshold, resample the particles and reset $w^k = 1$.
    • Particle update: For each $k$, sample $x_t^k \sim \tilde{p}(x_t \mid x_{t+1}^k, y)$ and update

    $$w^k \leftarrow w^k \cdot \frac{p(x_t^k \mid x_{t+1}^k)\, \tilde{p}(y \mid x_t^k)}{\tilde{p}(y \mid x_{t+1}^k)\, \tilde{p}(x_t^k \mid x_{t+1}^k, y)}.$$

  3. Output: The empirical measure $\sum_{k=1}^{K} \bar{w}^k\, \delta_{x_0^k}$, with normalized weights $\bar{w}^k = w^k / \sum_{j} w^j$, approximates $p(x_0 \mid y)$.

This procedure provably converges to the exact posterior as $K \to \infty$, given appropriate regularity conditions on the twisting functions, proposal support, and resampling threshold.
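
The steps above can be sketched end to end on a one-dimensional toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the prior is $\mathcal{N}(0, 1)$ with its exact score standing in for $s_\theta$, the likelihood is Gaussian with variance `TAU2`, the twisting function evaluates the likelihood at a denoised estimate `x0_hat`, resampling is multinomial and triggered when the ESS drops below `K / 2`, and every name and constant is hypothetical.

```python
import math
import random

SIGMA2, T, TAU2 = 0.04, 50, 0.25  # assumed toy values

def v(t):
    """Marginal variance of x_t when x_0 ~ N(0, 1) and each step adds SIGMA2."""
    return 1.0 + t * SIGMA2

def score(x, t):
    """Exact score of N(0, v(t)); a stand-in for a learned s_theta."""
    return -x / v(t)

def log_twist(x, t, y):
    """log l(y | x0_hat) up to a constant, with x0_hat = x + t*SIGMA2*score(x, t)."""
    x0_hat = x + t * SIGMA2 * score(x, t)
    return -0.5 * (y - x0_hat) ** 2 / TAU2

def grad_log_twist(x, t, y):
    """Derivative of log_twist in x; in this toy model x0_hat = x / v(t)."""
    return (y - x / v(t)) / (TAU2 * v(t))

def log_normal(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def normalize(logw):
    m = max(logw)
    w = [math.exp(l - m) for l in logw]
    total = sum(w)
    return [wi / total for wi in w]

def tds(y, K, seed=0):
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, math.sqrt(v(T))) for _ in range(K)]  # x_T ~ p(x_T)
    logw = [log_twist(x, T, y) for x in xs]                   # initial weights
    for t in range(T - 1, -1, -1):
        w = normalize(logw)
        if 1.0 / sum(wi * wi for wi in w) < K / 2:            # ESS below assumed threshold
            xs = rng.choices(xs, weights=w, k=K)              # multinomial resampling
            logw = [0.0] * K
        new_xs = []
        for k, x_next in enumerate(xs):
            base = x_next + SIGMA2 * score(x_next, t + 1)            # unconditional mean
            prop = base + SIGMA2 * grad_log_twist(x_next, t + 1, y)  # twisted mean
            x = rng.gauss(prop, math.sqrt(SIGMA2))
            logw[k] += (log_normal(x, base, SIGMA2) + log_twist(x, t, y)
                        - log_twist(x_next, t + 1, y) - log_normal(x, prop, SIGMA2))
            new_xs.append(x)
        xs = new_xs
    return xs, normalize(logw)

xs, w = tds(y=1.5, K=2000)
post_mean = sum(wi * xi for wi, xi in zip(w, xs))
post_var = sum(wi * (xi - post_mean) ** 2 for wi, xi in zip(w, xs))
```

For this toy model the exact posterior is $\mathcal{N}(1.2, 0.2)$, so `post_mean` and `post_var` should land near those values, up to Monte Carlo error and a small bias from the time discretization of the reverse chain. Working with log-weights and subtracting the maximum before exponentiating is a common way to avoid underflow when many incremental weights are multiplied.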

4. Asymptotic Exactness and Theoretical Guarantees

TDS inherits the asymptotic exactness of SMC under regularity assumptions: bounded and positive twisting functions, proposal distributions with full support, and resampling thresholds […]. For any bounded test function $\varphi$,

$$\sum_{k=1}^{K} \bar{w}^k\, \varphi(x_0^k) \longrightarrow \mathbb{E}_{p(x_0 \mid y)}\left[\varphi(x_0)\right] \quad \text{as } K \to \infty$$

with probability one, where $x_0^k$ are the final particles and $\bar{w}^k$ their normalized weights. The empirical distribution of weighted samples converges setwise to the posterior $p(x_0 \mid y)$. This property follows from SMC theory and ensures that any estimator formed from weighted particles is consistent as $K$ increases (Wu et al., 2023).

5. Empirical Performance and Computational Trade-offs

In synthetic and real data settings, TDS displays favorable computational-statistical trade-offs. In 2D Gaussian settings with known likelihood, the error in estimating […] using TDS decreases as […], while guidance-only and naive importance sampling methods require exponentially more particles with respect to the KL divergence. On MNIST class-conditional generation tasks ([…], pretrained ResNet-50 likelihood), […] TDS particles outperform reconstruction guidance, and […] achieves near-perfect classification accuracy. For MNIST inpainting tasks, TDS achieves higher Bayes accuracy and effective sample size (ESS) than prior SMC-Diff and heuristic replacement schemes. In motif-scaffolding for proteins (FrameDiff model, […]), […] particles increase in silico success rates (measured by AlphaFold+ProteinMPNN self-consistency) by up to […] over […], matching or surpassing RFdiffusion performance on short scaffolds.

6. Extension to Riemannian Diffusion Models

TDS generalizes to Riemannian state spaces relevant for geometric learning tasks, such as $SE(3)^N$ for protein backbone design. The forward process employs Variance Exploding (VE) noise within tangent spaces, and inference relies on Tangent-Normal Gaussian kernels mapped via the exponential map. Conditioning (e.g., on a motif) is imposed by defining the twisting functions using Tangent-Normal densities,

[…]

where motif placement, global rotation, or degrees of freedom are accommodated by summing […] over appropriate submanifold choices. The cancellation of Jacobian terms in the […] weight ratio enables efficient application of TDS in manifold settings.
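
To make the idea of a Gaussian kernel "mapped via the exponential map" concrete, the sketch below takes one noisy step on the unit sphere $S^2$, which is used here only as a simpler stand-in for $SE(3)$. The function name, the drift argument, and the choice of manifold are assumptions for illustration and are not taken from the paper.

```python
import math
import random

def tangent_normal_step(x, drift, sigma, rng):
    """One step on the unit sphere S^2.

    x     : current point, a unit vector of length 3
    drift : tangent vector at x (must be orthogonal to x)
    sigma : noise scale in the tangent space
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    dot = sum(gi * xi for gi, xi in zip(g, x))
    noise = [gi - dot * xi for gi, xi in zip(g, x)]    # project noise onto the tangent plane
    vec = [d + sigma * n for d, n in zip(drift, noise)]
    norm = math.sqrt(sum(c * c for c in vec))
    if norm < 1e-12:                                   # zero tangent vector: stay in place
        return list(x)
    # Exponential map on the sphere: follow the geodesic in direction vec for length |vec|.
    return [math.cos(norm) * xi + math.sin(norm) * c / norm for xi, c in zip(x, vec)]

rng = random.Random(0)
x_new = tangent_normal_step([0.0, 0.0, 1.0], [0.1, 0.0, 0.0], 0.2, rng)
norm_new = math.sqrt(sum(c * c for c in x_new))        # stays on the sphere
```

Because the noise and drift live in the tangent plane and the exponential map follows a geodesic, the updated point remains on the manifold without any projection or renormalization step.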

7. Implementation Considerations and Benchmarking Methodology

Practical deployment of TDS involves configuring parameters such as the number of reverse steps ($T$), resampling schedule, and particle count ($K$). For MNIST, […] with VE or VP schedules; for proteins, […]. Systematic resampling is triggered at ESS […]. Sharpening conditioning is achievable by raising the twisting functions to a power greater than one, at the risk of distorted samples when the exponent is excessive. Benchmarks utilize metrics such as:

  • Effective Sample Size (ESS)
  • Classification accuracy (CNN/human-annotated) for MNIST
  • Bayes accuracy from weighted samples
  • Protein design success rate (AlphaFold scRMSD below threshold)
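
The ESS metric and the systematic resampling step mentioned above can be sketched in a few lines. The function names and the example weights are hypothetical; the formulas are the standard ones, with ESS computed as $(\sum_k w^k)^2 / \sum_k (w^k)^2$ and systematic resampling using a single uniform draw to place $K$ evenly spaced points on the cumulative weights.

```python
import random

def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2; equals K for uniform weights, 1 if one weight dominates."""
    total = sum(weights)
    return total ** 2 / sum(w * w for w in weights)

def systematic_resample(weights, rng):
    """Return K ancestor indices using one uniform draw and K evenly spaced points."""
    K = len(weights)
    total = sum(weights)
    u = rng.random() / K                    # single offset in [0, 1/K)
    indices, j, cum = [], 0, weights[0] / total
    for k in range(K):
        point = u + k / K
        while point > cum and j < K - 1:    # advance to the particle covering this point
            j += 1
            cum += weights[j] / total
        indices.append(j)
    return indices

weights = [0.7, 0.1, 0.1, 0.1]
ess = effective_sample_size(weights)        # 1 / 0.52, roughly 1.92 out of 4
idx = systematic_resample(weights, random.Random(0))
```

With these weights the first particle is selected either two or three times, since systematic resampling keeps each particle's copy count within one of $K$ times its normalized weight. That low-variance property is a common reason to prefer it over multinomial resampling.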

In protein motif-scaffolding, […] particles yield […] higher in silico rates. Uniform subsampling over […] motif placements and […] global rotations imposes negligible cost, leveraging network sharing across […] evaluations.

TDS thus combines SMC-twisting theory with efficient conditional sampling in high-dimensional, structured data settings, achieving both practical and asymptotically exact performance across Euclidean and geometric state spaces (Wu et al., 2023).
