Score-based constrained generative modeling via Langevin diffusions with boundary conditions

Published 28 Oct 2025 in stat.ML, cs.LG, cs.NA, and math.NA | (2510.23985v1)

Abstract: Score-based generative models based on stochastic differential equations (SDEs) achieve impressive performance in sampling from unknown distributions, but often fail to satisfy underlying constraints. We propose a constrained generative model using kinetic (underdamped) Langevin dynamics with specular reflection of velocity on the boundary defining constraints. This results in piecewise continuously differentiable noising and denoising processes, where the latter is characterized by time-reversed dynamics restricted to a domain with boundary due to the specular boundary condition. In addition, we also contribute to existing constrained generative models based on reflected SDEs, where the stochastic dynamics is restricted through an abstract local time term. By presenting efficient numerical samplers which converge with optimal rate in terms of the discretization step, we provide a comprehensive comparison of models based on confined (specularly reflected kinetic) Langevin diffusion with models based on reflected diffusion with local time.

Summary

The paper introduces a novel score-based diffusion model using kinetic Langevin dynamics with specular reflection, ensuring strict adherence to predefined constraints.
It compares multiple numerical schemes, including splitting methods for confined Langevin and reflected overdamped dynamics, showing superior sample quality with zero constraint violations and improved metrics (MMD and FID).
The study provides tractable score matching loss formulations and rigorous theoretical guarantees, opening new avenues for applying constrained generative modeling in high-dimensional domains.

Overview and Motivation

This paper addresses the challenge of generative modeling under explicit constraints, a scenario where standard score-based diffusion models often fail to respect domain boundaries due to the stochastic nature of their sampling processes. The authors propose a novel framework based on kinetic (underdamped) Langevin dynamics with specular reflection at the boundary, ensuring strict adherence to constraints during both training and sampling. The work also provides a comprehensive computational and theoretical comparison with reflected (overdamped) Langevin models, including a detailed analysis of numerical schemes and loss functions for score matching in constrained domains.

Mathematical Formulation

Confined Kinetic Langevin Dynamics

The core innovation is the use of kinetic Langevin SDEs with specular reflection, where the position $X_t$ and velocity $V_t$ evolve as:

$\begin{aligned} X_t &= X_0 + \int_0^t V_s \, ds, \\ V_t &= V_0 + \int_0^t b(X_s) \, ds - \int_0^t \gamma V_s \, ds + \int_0^t \sqrt{2\gamma} \, dW_s - 2 \sum_{0 < s \leq t} \langle n(X_s), V_s \rangle n(X_s) I_{\partial G}(X_s), \end{aligned}$

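where $G$ is the constraint set, $n(x)$ is the outward normal at $x \in \partial G$, $b$ is the drift, and $\gamma$ is the friction coefficient. The final sum is the specular reflection term: each time the position hits the boundary, the normal component of the velocity is reversed while the tangential component is unchanged, so the angle of reflection equals the angle of incidence.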
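The reflection law in the jump term can be illustrated with a minimal sketch. The function name `specular_reflect` and the example wall are assumptions made here for illustration and are not taken from the paper.

```python
import numpy as np

def specular_reflect(v, n):
    """Reflect velocity v at a boundary point with outward normal n.

    Implements v - 2 <n, v> n: the normal component of v changes sign,
    the tangential component is left untouched.
    """
    n = n / np.linalg.norm(n)  # make sure the normal has unit length
    return v - 2.0 * np.dot(n, v) * n

# Hypothetical example: a particle hits the wall x_1 = 1 of a box,
# where the outward normal is the first coordinate direction.
v_in = np.array([0.6, -0.8])
normal = np.array([1.0, 0.0])
v_out = specular_reflect(v_in, normal)
```

The reflection preserves the speed, which is why it leaves the kinetic energy of the trajectory unchanged at a collision.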

The time-reversed dynamics are derived via the Fokker-Planck equation with specular boundary conditions, yielding a reverse SDE for sampling that strictly respects the domain constraints.

Reflected Overdamped Langevin Dynamics

For comparison, the reflected SDE is given by:

$X_t = X_0 + \int_0^t b(X_s) ds + \sqrt{2} W_t - \int_0^t n(X_s) dL_s,$

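where $L_t$ is the local time of $X$ on the boundary, a non-decreasing process that grows only while $X_t \in \partial G$ and pushes the trajectory back along the inward normal just enough to keep it in the domain. The corresponding Fokker-Planck equation imposes Neumann boundary conditions.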

Numerical Schemes and Implementation

Splitting Methods for Confined Langevin Dynamics

The authors develop and analyze several splitting schemes for simulating the confined Langevin process, including [A$_c$OA$_c$], [SA$_c$OA$_c$S], and CBBK-S, which combine Ornstein-Uhlenbeck updates with collision handling via specular reflection. The collision step is efficiently implemented by computing the time of boundary crossing and updating the velocity according to the reflection law.

For box constraints, the collision procedure simplifies to coordinate-wise reflection, enabling efficient implementation in high dimensions.
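A rough sketch of how a splitting step with coordinate-wise reflection might look for a box is given below. It is not the paper's exact scheme: the step ordering (half flight, Ornstein-Uhlenbeck update, half flight), the zero drift $b = 0$, the unit box, and all function names are assumptions made here for illustration. The only fact relied on is that the velocity equation $dV = -\gamma V \, dt + \sqrt{2\gamma} \, dW$ has an exact Gaussian update.

```python
import numpy as np

def free_flight_box(x, v, h, lo=0.0, hi=1.0):
    """Move x along v for time h inside [lo, hi]^d with specular reflection.

    Each coordinate is unfolded onto a line of period 2 * width and folded
    back, which handles any number of wall collisions within the step.
    """
    width = hi - lo
    u = np.mod(x - lo + h * v, 2.0 * width)
    flipped = u > width  # odd number of collisions in this coordinate
    x_new = lo + np.where(flipped, 2.0 * width - u, u)
    v_new = np.where(flipped, -v, v)
    return x_new, v_new

def ou_velocity_update(v, h, gamma, rng):
    """Exact update of dV = -gamma V dt + sqrt(2 gamma) dW over time h."""
    decay = np.exp(-gamma * h)
    noise = rng.standard_normal(v.shape)
    return decay * v + np.sqrt(1.0 - decay**2) * noise

def confined_step(x, v, h, gamma, rng):
    """Half flight, full velocity update, half flight (zero drift assumed)."""
    x, v = free_flight_box(x, v, 0.5 * h)
    v = ou_velocity_update(v, h, gamma, rng)
    return free_flight_box(x, v, 0.5 * h)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(1000, 2))
v = rng.standard_normal((1000, 2))
for _ in range(200):
    x, v = confined_step(x, v, h=0.05, gamma=1.0, rng=rng)
```

Because the fold is applied per coordinate with array operations, the cost of collision handling grows linearly with the dimension, which is the sense in which box constraints are cheap to enforce.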

Schemes for Reflected SDEs

Multiple methods for enforcing constraints in reflected SDEs are compared:

Projection: After each Euler step, project out-of-domain points onto the boundary.
Symmetrized Reflection: Reflect the trajectory across the boundary, achieving first-order weak accuracy.
Penalty: Penalize excursions outside the domain via a drift term.
Barrier: Add a barrier potential near the boundary, diverging as the trajectory approaches $\partial G$.

The symmetrized reflection and projection methods are shown to be effective for convex domains, with rigorous convergence guarantees.
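The difference between the projection and reflection treatments can be sketched for a unit box as follows. This is a simplified illustration, not the paper's implementation: the names `euler_proposal`, `project_box`, `reflect_box`, and `simulate` are hypothetical, the drift is set to zero, and the mirror map assumes the step size is small enough that a proposal never overshoots the boundary by more than the width of the box.

```python
import numpy as np

def euler_proposal(x, drift, h, rng):
    """Unconstrained Euler-Maruyama step for dX = b(X) dt + sqrt(2) dW."""
    noise = rng.standard_normal(x.shape)
    return x + h * drift(x) + np.sqrt(2.0 * h) * noise

def project_box(y, lo=0.0, hi=1.0):
    """Projection: move out-of-domain coordinates onto the nearest face."""
    return np.clip(y, lo, hi)

def reflect_box(y, lo=0.0, hi=1.0):
    """Reflection: mirror out-of-domain coordinates across the nearest face.

    Assumes the overshoot is smaller than hi - lo (small step size).
    """
    nearest = np.clip(y, lo, hi)
    return 2.0 * nearest - y

def simulate(x0, drift, h, n_steps, boundary_map, rng):
    """Repeat: propose an Euler step, then map the result back into the box."""
    x = x0.copy()
    for _ in range(n_steps):
        x = boundary_map(euler_proposal(x, drift, h, rng))
    return x

rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 1.0, size=(1000, 2))
zero_drift = lambda x: np.zeros_like(x)
x_proj = simulate(x0, zero_drift, 1e-3, 200, project_box, rng)
x_refl = simulate(x0, zero_drift, 1e-3, 200, reflect_box, rng)
```

Projection places samples exactly on the boundary, while the mirror map sends them back into the interior by the amount they overshot.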

Score Matching Loss Functions

A key theoretical contribution is the derivation of tractable score matching losses for both confined and reflected models. For the confined Langevin model, the loss is:

$\mathbb{E}_{t, X_t, V_t} \left[ |s_\theta(t, X_t, V_t)|^2 + 2 \operatorname{div}_v s_\theta(t, X_t, V_t) \right] + C,$

where $s_\theta$ is the score network.
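To see why minimizing a loss of the form $\mathbb{E}[|s_\theta|^2 + 2 \operatorname{div}_v s_\theta]$ recovers the score without access to the true density, a toy check may help. The setup below is an assumption made purely for illustration: the time and position dependence are dropped, the velocities are Gaussian with standard deviation `sigma`, and the "network" is the one-parameter family $s_\theta(v) = -\theta v$, whose divergence is $-\theta d$ in dimension $d$. The true score of this Gaussian is $-v / \sigma^2$, so the minimizer should be close to $1/\sigma^2$.

```python
import numpy as np

def implicit_score_matching_loss(theta, v):
    """Monte Carlo estimate of E[|s|^2 + 2 div_v s] for s(v) = -theta * v."""
    d = v.shape[1]
    sq_norm = theta**2 * np.sum(v**2, axis=1)  # |s_theta(v)|^2 per sample
    divergence = -theta * d                    # div_v(-theta * v) = -theta * d
    return np.mean(sq_norm + 2.0 * divergence)

rng = np.random.default_rng(0)
sigma = 0.5
v = sigma * rng.standard_normal((20000, 3))

# Grid search over the single parameter; the true value is 1 / sigma^2 = 4.
thetas = np.linspace(0.5, 8.0, 301)
losses = [implicit_score_matching_loss(t, v) for t in thetas]
theta_hat = thetas[int(np.argmin(losses))]
```

For a neural score network the divergence is not available in closed form and would have to be computed by automatic differentiation or a stochastic trace estimator; that choice is outside what this summary specifies.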

For the reflected model, the loss includes a boundary correction term involving the local time:

$\mathbb{E}_{t, X_t} \left[ |s_\theta(t, X_t)|^2 + 2 \operatorname{div} s_\theta(t, X_t) - \frac{2}{t} \int_0^t \langle s_\theta(r, X_r), n(X_r) \rangle dL_r \right] + C.$

The paper demonstrates that the local time term is tractable and can be efficiently estimated via Monte Carlo, contradicting previous claims that it is intractable.

Empirical Evaluation

Experiments on toy datasets (Gaussian mixtures, wheel, maze, flower) and MNIST demonstrate that the confined Langevin models with specular reflection consistently generate samples that strictly respect constraints, outperforming DDPM and reflected SDE models in terms of Maximum Mean Discrepancy (MMD) and constraint violation rates. For MNIST, the confined Langevin model achieves a lower Fréchet Inception Distance (FID) than DDPM with clamping.

Strong numerical results include:

Zero constraint violations for all confined Langevin and symmetrized reflection models.
Significantly lower MMD for confined Langevin models (e.g., CBBK-S: $(0.28 \pm 0.01) \times 10^{-2}$) compared to DDPM and penalty methods.
Lower FID for MNIST ($69.85$ for CLD vs. $185.79$ for DDPM).
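For readers unfamiliar with the two sample-based metrics reported for the toy datasets, the sketch below shows one common way to compute them. The kernel choice (Gaussian RBF), the bandwidth, the biased estimator, and the unit-box constraint are all assumptions made here; the summary does not state which variants the paper uses.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

def violation_rate(samples, lo=0.0, hi=1.0):
    """Fraction of samples with at least one coordinate outside [lo, hi]."""
    outside = np.any((samples < lo) | (samples > hi), axis=1)
    return float(np.mean(outside))

rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(500, 2))  # stand-in for data
good = rng.uniform(0.0, 1.0, size=(500, 2))       # same distribution
bad = good + 0.5                                  # shifted, leaves the box
```

Here `good` and `bad` are synthetic stand-ins for generated samples: the first matches the reference distribution and stays in the box, the second is shifted so that it both increases the MMD and violates the constraint.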

Theoretical and Practical Implications

The introduction of specularly reflected kinetic Langevin dynamics provides a principled approach for generative modeling under hard constraints, with strict theoretical guarantees on constraint satisfaction. The tractable score matching loss enables efficient training without imposing artificial conditions on the score network at the boundary.

The comparison of numerical schemes highlights the importance of accurate boundary handling for both sampling quality and computational efficiency. The splitting methods for kinetic Langevin dynamics are shown to converge with optimal rates, and the symmetrized reflection scheme for overdamped SDEs achieves first-order weak accuracy.

Limitations and Future Directions

The paper notes the absence of large-scale experiments, focusing primarily on toy datasets and a subset of MNIST. Extending these methods to high-dimensional, real-world domains (e.g., molecular generation, physical simulation) and integrating with advanced architectures (e.g., U-Nets, transformers) remains an open direction.

Further research could explore:

Adaptive splitting schemes for complex geometries.
Extension to manifold constraints and non-Euclidean domains.
Integration with classifier-free guidance and conditional generation.
Theoretical analysis of ergodicity and mixing times in high dimensions.

Conclusion

This work establishes a rigorous framework for score-based generative modeling under explicit constraints, leveraging kinetic Langevin dynamics with specular reflection and providing tractable loss functions for score matching. The proposed methods achieve strict constraint satisfaction and superior sample quality compared to existing approaches, with strong theoretical foundations and practical numerical schemes. The results have significant implications for constrained generative modeling in scientific and engineering applications, and open avenues for further research in high-dimensional and manifold-constrained domains.