Twisted Diffusion Sampler (TDS)

Updated 15 April 2026

Twisted Diffusion Sampler (TDS) is a sequential Monte Carlo algorithm that employs twisting to enable asymptotically exact conditional sampling from unconditional diffusion models.
It leverages weighted particles and time-dependent potentials in a reverse diffusion process to efficiently incorporate conditioning information and converge to the true posterior.
TDS has demonstrated significant empirical improvements in tasks such as image inpainting, class-conditional generation, and protein motif-scaffolding, and it extends to Riemannian state spaces.

The Twisted Diffusion Sampler (TDS) is a sequential Monte Carlo (SMC) algorithm enabling practical and asymptotically exact conditional sampling from distributions induced by unconditional diffusion models. Unlike prior approaches that depend on task-specific conditional training or heuristic approximations, TDS leverages SMC principles and the method of “twisting” to realize flexible and accurate conditional generation without retraining diffusion networks. The technique operates by simulating a set of weighted particles through the reverse diffusion chain, where twisting incorporates conditioning information and ensures convergence to the true posterior as the number of particles increases. TDS applies to both Euclidean and Riemannian state spaces and has demonstrated empirical improvements over existing heuristics in image inpainting, class-conditional image generation, and motif-scaffolding for protein design (Wu et al., 2023).

1. Conditional Sampling in Diffusion Models

Given an unconditional diffusion model with data distribution $x_0 \sim p(x_0)$ and reverse transitions

$x_t|x_{t+1} \sim \mathcal{N}(x_t; x_{t+1}+\sigma_{t+1}^2 s_\theta(x_{t+1}, t+1), \sigma_{t+1}^2 I)$

for $t = T-1, \ldots, 0$, conditional generation aims to sample from the posterior $p(x_0|y) \propto p(x_0) \ell(y|x_0)$, where $y$ denotes observations and $\ell(y|x_0)$ is a likelihood term. The Markov chain $x_0 \rightarrow x_1 \rightarrow \ldots \rightarrow x_T$ facilitates representing the joint as

$p(x_{0:T}, y) = p(x_0) \prod_{t=0}^{T-1} p(x_{t+1}|x_t) \ell(y|x_0).$

This formulation reduces conditional sampling to approximating the marginal $p(x_0|y)$, typically intractable due to the high-dimensional latent space and the complex form of $\ell(y|x_0)$.
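The reverse transition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `reverse_step`, the constant variance schedule `sigma2`, and the analytic `toy_score` (exact for data $x_0 \sim \mathcal{N}(0, 1)$ under variance-exploding noise) are all assumptions made for the example; in practice the score would come from a trained network $s_\theta$.

```python
import numpy as np

def reverse_step(x_next, t_next, score_fn, sigma2, rng):
    """Draw x_t ~ N(x_{t+1} + sigma_{t+1}^2 * s(x_{t+1}, t+1), sigma_{t+1}^2 I)."""
    var = sigma2[t_next]
    mean = x_next + var * score_fn(x_next, t_next)
    return mean + np.sqrt(var) * rng.standard_normal(x_next.shape)

# Hypothetical toy setup: x_0 ~ N(0, 1) and x_t = x_{t-1} + N(0, s2),
# so the marginal of x_t is N(0, 1 + t * s2) and its score is known exactly.
T, s2 = 50, 0.1
sigma2 = np.full(T + 1, s2)          # sigma2[t] plays the role of sigma_t^2

def toy_score(x, t):
    return -x / (1.0 + t * s2)

rng = np.random.default_rng(0)
x = np.sqrt(1.0 + T * s2) * rng.standard_normal(20000)   # x_T ~ N(0, 1 + T * s2)
for t in range(T - 1, -1, -1):
    x = reverse_step(x, t + 1, toy_score, sigma2, rng)
# After the loop, x holds approximate samples of x_0.
```

With this toy score, the samples end up with a standard deviation close to one; the small excess comes from time discretization.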

2. The Twisting Principle in SMC

Twisting is employed within SMC to progressively introduce conditioning information via time-dependent potentials $\tilde{p}_t(y|x_t)$, approximating the intractable likelihoods $p(y|x_t) = \int \ell(y|x_0)\, p(x_0|x_t)\, dx_0$. For each step $t = T-1, \ldots, 0$, the twisted proposal is defined as

$\tilde{p}(x_t|x_{t+1}, y) = \mathcal{N}(x_t; x_{t+1}+\sigma_{t+1}^2 [s_\theta(x_{t+1}, t+1) + \nabla_{x_{t+1}} \log \tilde{p}_{t+1}(y|x_{t+1})], \sigma_{t+1}^2 I)$

and the corresponding twisted importance weight is

$w_t = \frac{p(x_t|x_{t+1})\, \tilde{p}_t(y|x_t)}{\tilde{p}_{t+1}(y|x_{t+1})\, \tilde{p}(x_t|x_{t+1}, y)}$

This construction allows the SMC chain to interpolate between the original unconditional process and the conditional target, propagating likelihood information backward through the chain. In continuous time, twisting alters the drift in the reverse SDE:

$dx_t = [f(x_t, t) - g(t)^2 (\nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log \tilde{p}_t(y|x_t))]\, dt + g(t)\, d\bar{B}_t$

where $f$ and $g$ are the drift and diffusion coefficient of the forward SDE and $\bar{B}_t$ is a reverse-time Brownian motion, preserving the diffusion coefficient.
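A short check of why the twisted weights target the posterior may help here. It is a sketch under stated assumptions: $\tilde{p}_t(y|x_t)$ denotes the twisting function, $\tilde{p}(x_t|x_{t+1}, y)$ the twisted proposal, $w_t$ the ratio of $p(x_t|x_{t+1})\, \tilde{p}_t(y|x_t)$ to $\tilde{p}_{t+1}(y|x_{t+1})\, \tilde{p}(x_t|x_{t+1}, y)$, and the final twisting function is assumed to equal the likelihood, $\tilde{p}_0(y|x_0) = \ell(y|x_0)$. Multiplying the initial weight, the proposals, and the incremental weights gives

$p(x_T)\, \tilde{p}_T(y|x_T) \prod_{t=0}^{T-1} w_t\, \tilde{p}(x_t|x_{t+1}, y) = p(x_T) \prod_{t=0}^{T-1} p(x_t|x_{t+1}) \cdot \tilde{p}_T(y|x_T) \prod_{t=0}^{T-1} \frac{\tilde{p}_t(y|x_t)}{\tilde{p}_{t+1}(y|x_{t+1})} = p(x_{0:T})\, \ell(y|x_0).$

The ratio of twisting functions telescopes, so the intermediate choices of $\tilde{p}_t$ affect only the variance of the weights, not the target, which is proportional to $p(x_{0:T}|y)$.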

3. Twisted Diffusion Sampler Algorithm

The TDS proceeds in discrete time with $K$ particles and a time horizon of $T$ steps:

Initialization: For each particle $k = 1, \ldots, K$, sample $x_T^{(k)} \sim p(x_T)$ and set $w_T^{(k)} = \tilde{p}_T(y|x_T^{(k)})$, where $\tilde{p}_t(y|x_t)$ denotes the twisting function at step $t$.
Reverse Propagation (for $t = T-1$ down to $t = 0$):
- Resampling: Compute effective sample size (ESS); if it falls below the chosen threshold, resample the particles and reset their weights to uniform.
- Particle update: For each $k$, sample $x_t^{(k)} \sim \tilde{p}(x_t|x_{t+1}^{(k)}, y)$ from the twisted proposal and update
$w_t^{(k)} = w_{t+1}^{(k)} \frac{p(x_t^{(k)}|x_{t+1}^{(k)})\, \tilde{p}_t(y|x_t^{(k)})}{\tilde{p}_{t+1}(y|x_{t+1}^{(k)})\, \tilde{p}(x_t^{(k)}|x_{t+1}^{(k)}, y)}$
Output: The empirical measure $\sum_{k=1}^{K} \bar{w}_0^{(k)} \delta_{x_0^{(k)}}$, with normalized weights $\bar{w}_0^{(k)}$, approximates $p(x_0|y)$.

This procedure provably converges to the exact posterior as $K \to \infty$, given appropriate regularity conditions on the twisting functions, proposal support, and resampling threshold.
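The steps above can be sketched end to end on a one-dimensional Gaussian toy problem where everything is analytic. This is an illustrative sketch, not the authors' code: the function `tds_toy`, the prior $x_0 \sim \mathcal{N}(0, 1)$, the observation model $y = x_0 + \mathcal{N}(0, \tau^2)$, the constant step variance `s2`, the ESS threshold of $K/2$, and the use of multinomial resampling are all assumptions chosen for brevity. The twisting function is taken to be the exact Gaussian $p(y|x_t)$ of the continuous-noise toy model, which equals $\ell(y|x_0)$ at $t = 0$.

```python
import numpy as np

def log_normal(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def tds_toy(y, K=2000, T=100, s2=0.05, tau2=0.25, seed=0):
    rng = np.random.default_rng(seed)
    v = 1.0 + s2 * np.arange(T + 1)            # marginal variance of x_t

    def score(x, t):                            # exact score of N(0, v_t)
        return -x / v[t]

    def twist_var(t):                           # variance of y given x_t
        return s2 * t / v[t] + tau2

    def log_twist(x, t):                        # log of the twisting function
        return log_normal(y, x / v[t], twist_var(t))

    def grad_log_twist(x, t):                   # gradient of log_twist in x
        return (y - x / v[t]) / (v[t] * twist_var(t))

    # Initialization: x_T ~ p(x_T), weights set to the twisting function.
    x = np.sqrt(v[T]) * rng.standard_normal(K)
    log_w = log_twist(x, T)

    for t in range(T - 1, -1, -1):
        # Resampling (multinomial here) when ESS drops below K / 2.
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        if 1.0 / np.sum(w ** 2) < 0.5 * K:
            idx = rng.choice(K, size=K, p=w)
            x, log_w = x[idx], np.zeros(K)

        # Twisted proposal: unconditional mean shifted by the twist gradient.
        mean_uncond = x + s2 * score(x, t + 1)
        mean_twist = mean_uncond + s2 * grad_log_twist(x, t + 1)
        x_new = mean_twist + np.sqrt(s2) * rng.standard_normal(K)

        # Weight update: p(x_t|x_{t+1}) * twist_t / (twist_{t+1} * proposal).
        log_w = (log_w
                 + log_normal(x_new, mean_uncond, s2) + log_twist(x_new, t)
                 - log_twist(x, t + 1) - log_normal(x_new, mean_twist, s2))
        x = x_new

    w = np.exp(log_w - log_w.max())
    return x, w / w.sum()

x0, w = tds_toy(y=1.5)
post_mean = np.sum(w * x0)   # should land near the analytic value y / (1 + tau2) = 1.2
```

Because the twisting function is nearly optimal in this toy problem, the weights stay close to uniform; with a learned, approximate twisting function the resampling step does more of the work.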

4. Asymptotic Exactness and Theoretical Guarantees

TDS inherits the asymptotic exactness of SMC under regularity assumptions: bounded and positive twisting functions, proposal distributions with full support, and a suitable ESS resampling threshold. For any bounded test function $\varphi$,

$\sum_{k=1}^{K} \bar{w}_0^{(k)} \varphi(x_0^{(k)}) \rightarrow \mathbb{E}[\varphi(x_0)|y]$ as $K \to \infty$

with probability one, where $\bar{w}_0^{(k)}$ are the normalized final weights. The empirical distribution of weighted samples converges setwise to the posterior $p(x_0|y)$. This property follows from SMC theory and ensures that any estimator formed from weighted particles is consistent as $K$ increases (Wu et al., 2023).
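The consistency of weighted-particle estimators can be illustrated with plain self-normalized importance sampling, the single-step special case of SMC. The sketch below is not TDS itself: the target $\mathcal{N}(1, 1)$, the proposal $\mathcal{N}(0, 4)$, the bounded test function `np.tanh`, and the helper name `snis_estimate` are assumptions chosen only to show the estimate stabilizing as the particle count grows.

```python
import numpy as np

def snis_estimate(phi, K, rng):
    """Self-normalized importance sampling estimate of E[phi(X)], X ~ N(1, 1)."""
    x = 2.0 * rng.standard_normal(K)                  # proposal N(0, 4)
    # log target minus log proposal, with additive constants dropped
    log_w = -0.5 * (x - 1.0) ** 2 + x ** 2 / 8.0
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return np.sum(w * phi(x))

rng = np.random.default_rng(0)
estimates = {K: snis_estimate(np.tanh, K, rng) for K in (100, 10_000, 1_000_000)}
# Direct Monte Carlo from the target, for comparison.
reference = np.tanh(1.0 + rng.standard_normal(1_000_000)).mean()
```

Comparing the entries of `estimates` with `reference` shows the gap shrinking with $K$, which is the behavior the theorem guarantees for the TDS output under its assumptions.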

5. Empirical Performance and Computational Trade-offs

In synthetic and real data settings, TDS displays favorable computational-statistical trade-offs. In 2D Gaussian settings with known likelihood, the error of the TDS posterior estimate decreases as the number of particles grows, while guidance-only and naive importance sampling methods require exponentially more particles with respect to the KL divergence. On MNIST class-conditional generation tasks (pretrained ResNet-50 likelihood), TDS outperforms reconstruction guidance and, with enough particles, achieves near-perfect classification accuracy. For MNIST inpainting tasks, TDS achieves higher Bayes accuracy and effective sample size (ESS) than prior SMC-Diff and heuristic replacement schemes. In motif-scaffolding for proteins (FrameDiff model), using more particles increases in silico success rates (measured by AlphaFold+ProteinMPNN self-consistency), matching or surpassing RFdiffusion performance on short scaffolds.

6. Extension to Riemannian Diffusion Models

TDS generalizes to Riemannian state spaces relevant for geometric learning tasks, such as $SE(3)^N$ for protein backbone design. The forward process employs Variance Exploding (VE) noise within tangent spaces, and inference relies on Tangent-Normal Gaussian kernels mapped via the exponential map. Conditioning (e.g., on a motif) is imposed by defining the twisting functions using Tangent-Normal densities, where motif placement, global rotation, or degrees of freedom are accommodated by summing the twisting function over appropriate submanifold choices. The cancellation of Jacobian terms in the importance weight ratio enables efficient application of TDS in manifold settings.
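To make the idea of a Tangent-Normal kernel concrete, the sketch below draws one such step on the rotation group $SO(3)$: a Gaussian vector is sampled in the tangent space and pushed onto the manifold with the exponential map (Rodrigues' formula). The function names `hat`, `so3_exp`, and `tangent_normal_step`, and the restriction to rotations only, are assumptions for illustration; a full $SE(3)$ treatment would also handle translations.

```python
import numpy as np

def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def so3_exp(v):
    """Exponential map from so(3) (as a 3-vector) to SO(3), via Rodrigues' formula."""
    theta = np.linalg.norm(v)
    A = hat(v)
    if theta < 1e-8:
        return np.eye(3) + A                      # first-order expansion near zero
    return (np.eye(3)
            + np.sin(theta) / theta * A
            + (1.0 - np.cos(theta)) / theta ** 2 * (A @ A))

def tangent_normal_step(R, drift, sigma, rng):
    """Sample a Gaussian in the tangent space at R and map it back to SO(3)."""
    v = drift + sigma * rng.standard_normal(3)    # tangent-space Gaussian
    return R @ so3_exp(v)

rng = np.random.default_rng(0)
R_new = tangent_normal_step(np.eye(3), np.zeros(3), 0.3, rng)
```

In a twisted sampler the `drift` argument would carry the score plus the gradient of the log twisting function, both expressed in the tangent space.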

7. Implementation Considerations and Benchmarking Methodology

Practical deployment of TDS involves configuring parameters such as the number of reverse steps ($T$), the resampling schedule, and the particle count ($K$). The MNIST experiments use VE or VP schedules. Systematic resampling is triggered when the ESS falls below a threshold. Conditioning can be sharpened by raising the twisting functions to a power greater than one, at the risk of distorted samples when the exponent is too large. Benchmarks utilize metrics such as:

Effective Sample Size (ESS)
Classification accuracy (CNN/human-annotated) for MNIST
Bayes accuracy from weighted samples
Protein design success rate (AlphaFold scRMSD below threshold)
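The ESS metric and the systematic resampling step mentioned above can be sketched as follows. The function names are hypothetical, and the $K/2$ trigger in the usage lines is an assumed threshold for illustration rather than the value used in the benchmarks.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum of squared normalized weights; equals K for uniform weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def systematic_resample(weights, rng):
    """Return K ancestor indices using one uniform draw and K evenly spaced points."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    K = len(w)
    positions = (rng.random() + np.arange(K)) / K
    cumulative = np.cumsum(w)
    cumulative[-1] = 1.0                     # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(0)
weights = np.array([0.7, 0.1, 0.1, 0.1])
if effective_sample_size(weights) < 0.5 * len(weights):   # assumed threshold K / 2
    ancestors = systematic_resample(weights, rng)
```

Systematic resampling uses a single random number per step, which keeps the number of copies of each particle within one of its expected value and typically adds less noise than multinomial resampling.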

In protein motif-scaffolding, using more particles yields higher in silico success rates. Uniform subsampling over motif placements and global rotations imposes negligible cost, because network evaluations are shared across them.

TDS thus combines SMC-twisting theory with efficient conditional sampling in high-dimensional, structured data settings, achieving both practical and asymptotically exact performance across Euclidean and geometric state spaces (Wu et al., 2023).


References (1)

Wu et al. (2023). Practical and Asymptotically Exact Conditional Sampling in Diffusion Models.
