
Discrete generative diffusion models without stochastic differential equations: a tensor network approach (2407.11133v1)

Published 15 Jul 2024 in cond-mat.stat-mech, cond-mat.dis-nn, and cs.LG

Abstract: Diffusion models (DMs) are a class of generative machine learning methods that sample a target distribution by transforming samples of a trivial (often Gaussian) distribution using a learned stochastic differential equation. In standard DMs, this is done by learning a "score function" that reverses the effect of adding diffusive noise to the distribution of interest. Here we consider the generalisation of DMs to lattice systems with discrete degrees of freedom, and where noise is added via Markov chain jump dynamics. We show how to use tensor networks (TNs) to efficiently define and sample such "discrete diffusion models" (DDMs) without explicitly having to solve a stochastic differential equation. We show the following: (i) by parametrising the data and evolution operators as TNs, the denoising dynamics can be represented exactly; (ii) the auto-regressive nature of TNs allows to generate samples efficiently and without bias; (iii) for sampling Boltzmann-like distributions, TNs allow to construct an efficient learning scheme that integrates well with Monte Carlo. We illustrate this approach to study the equilibrium of two models with non-trivial thermodynamics, the $d=1$ constrained Fredkin chain and the $d=2$ Ising model.

Authors (3)
  1. Luke Causer (15 papers)
  2. Grant M. Rotskoff (41 papers)
  3. Juan P. Garrahan (136 papers)

Summary

Overview of "Discrete Generative Diffusion Models Without Stochastic Differential Equations: A Tensor Network Approach"

This paper presents an approach to discrete diffusion models (DDMs) based on tensor networks (TNs) that does not rely on stochastic differential equations. The authors propose a method to efficiently represent, sample, and learn generative models for lattice systems with discrete degrees of freedom. They generalize diffusion models (DMs) to the case where noise is added via Markov chain jump dynamics rather than continuous diffusion, thereby avoiding the need to compute time-dependent score functions and the discretization errors that arise when approximately integrating the reverse-time dynamics in standard DMs.
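
To make the noising step concrete, the sketch below applies single-spin-flip Markov jump dynamics to the full probability vector of a small spin chain, which is the discrete analogue of adding Gaussian noise in a standard DM. This is a minimal illustration only: the system size, flip rate, and Euler time step are assumed values, and a brute-force 2^N probability vector is used in place of the TN representation discussed in the paper.

```python
import numpy as np
from itertools import product

# Minimal sketch: continuous-time Markov jump "noising" on N binary spins,
# acting on the full 2^N probability vector (feasible only for tiny N).
# Rates, N and dt are illustrative choices, not taken from the paper.

N = 4          # number of spins
gamma = 1.0    # flip rate per spin
dt = 0.05      # time step for the Euler update of the master equation

configs = list(product([0, 1], repeat=N))          # all 2^N configurations
index = {c: i for i, c in enumerate(configs)}

# Build the master-equation generator W: off-diagonal entries are the
# single-spin-flip rates, diagonal entries enforce probability conservation.
W = np.zeros((2**N, 2**N))
for c in configs:
    i = index[c]
    for site in range(N):
        flipped = list(c)
        flipped[site] ^= 1
        j = index[tuple(flipped)]
        W[j, i] += gamma          # jump c -> flipped
        W[i, i] -= gamma          # escape rate out of c

# Start from a sharply peaked "data" distribution and noise it:
p = np.zeros(2**N)
p[index[(0,) * N]] = 1.0

for _ in range(200):              # p(t+dt) ≈ p(t) + dt * W p(t)
    p = p + dt * W @ p

print(p.round(4))  # approaches the uniform (maximally noisy) distribution
```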

Main Contributions

  1. Tensor Network Representation: The authors demonstrate how TNs, particularly matrix product states (MPS) and matrix product operators (MPOs), can exactly represent and evolve probability distributions in discrete diffusion models. This parametrization allows the denoising dynamics, which are essential to DMs, to be realized exactly, a significant departure from traditional methodologies that rely on approximate solutions of stochastic differential equations.
  2. Sampling and Efficiency: The auto-regressive property of TNs is leveraged to generate samples efficiently and without bias, which is crucial for scaling these methods to larger systems. The paper outlines how TNs naturally support noising and denoising protocols formulated as Markov jump processes, rather than the continuous state-space diffusion of conventional DMs (see the sampling sketch after this list).
  3. Integration with Monte Carlo Methods: An efficient learning scheme is constructed for sampling Boltzmann-like distributions, integrating the proposed DDMs with Monte Carlo (MC) methods. Specifically, the authors detail a technique for training an MPS to approximate a target distribution, optimizing its parameters via an iterative MC process (illustrated schematically after this list).
  4. Applications to Models with Non-Trivial Thermodynamics: The approach is tested on models with challenging thermodynamic properties: the one-dimensional constrained Fredkin chain and the two-dimensional Ising model. The paper shows that TN-based DDMs, trained with the proposed learning scheme, provide insight into the equilibrium properties of these systems.
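
As referenced in point 2 above, the following is a minimal sketch of unbiased autoregressive sampling from a probability distribution parametrized by an MPS: right environments are precomputed by summing over physical indices, and each site is then drawn from its exact conditional in a single left-to-right sweep. The random non-negative tensors, bond dimension, and boundary convention are illustrative assumptions standing in for a trained, denoised MPS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch: unbiased autoregressive sampling from a probability
# distribution written as an MPS with non-negative tensors,
#   p(x_1,...,x_N) ∝ A_1[x_1] A_2[x_2] ... A_N[x_N]   (boundary bonds of dimension 1).
# The random tensors below stand in for a trained/denoised MPS.

N, d, D = 6, 2, 4   # sites, local dimension (spin up/down), bond dimension
tensors = [rng.random((d, (1 if k == 0 else D), (1 if k == N - 1 else D)))
           for k in range(N)]   # shape: (physical, left bond, right bond)

def sample(tensors):
    N = len(tensors)
    # Right environments: R[k] = sum over x_{k+1..N} of A_{k+1}[x]...A_N[x] (a vector).
    R = [None] * (N + 1)
    R[N] = np.ones(1)
    for k in range(N - 1, -1, -1):
        M = tensors[k].sum(axis=0)       # sum over the physical index
        R[k] = M @ R[k + 1]
    # Sweep left to right, sampling each site from its exact conditional.
    left = np.ones(1)                    # accumulated contraction of chosen values
    x = []
    for k in range(N):
        # Unnormalized conditional p(x_k | x_1..x_{k-1}) for each local value:
        weights = np.array([left @ tensors[k][s] @ R[k + 1] for s in range(d)])
        probs = weights / weights.sum()
        s = rng.choice(d, p=probs)
        x.append(int(s))
        left = left @ tensors[k][s]
    return x

print(sample(tensors))   # e.g. [1, 0, 1, 1, 0, 0]
```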
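
The interplay with Monte Carlo mentioned in point 3 can be illustrated schematically: a generative model that supplies both samples and their exact probabilities can act as an independence proposal in a Metropolis scheme targeting a Boltzmann distribution, so residual model error is corrected by the accept/reject step. The sketch below is not the paper's training procedure; it uses a 1D Ising energy and a stand-in independent-spin model (where a trained MPS would normally provide model_sample and model_logprob), with all parameters chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Schematic sketch: correcting an imperfect generative model with Metropolis.
# Target: Boltzmann distribution of a 1D Ising chain (open boundaries).
# Proposal: a stand-in independent-spin model with known probabilities
# (a trained MPS would play this role, supplying samples and exact log p_model).
# All parameters below are illustrative assumptions.

N, beta, J = 10, 1.0, 1.0

def energy(x):
    s = 2 * x - 1                          # map {0,1} -> {-1,+1}
    return -J * np.sum(s[:-1] * s[1:])

theta = rng.uniform(0.3, 0.7, size=N)      # stand-in model: P(x_i = 1) = theta_i

def model_sample():
    return (rng.random(N) < theta).astype(int)

def model_logprob(x):
    return np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

# Independence Metropolis: propose from the model, accept with the ratio that
# makes the Boltzmann distribution the exact stationary distribution.
x = model_sample()
energies = []
for _ in range(5000):
    y = model_sample()
    log_acc = (-beta * energy(y) + beta * energy(x)
               + model_logprob(x) - model_logprob(y))
    if np.log(rng.random()) < log_acc:
        x = y
    energies.append(energy(x))

print("mean energy per bond:", np.mean(energies) / (N - 1))  # ≈ -tanh(beta * J)
```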

Implications and Future Work

The implications of this research are twofold: practical and theoretical. Practically, the method offers a new avenue for efficiently sampling complex high-dimensional discrete distributions, with applications across fields where stochastic modeling is critical, such as statistical physics, machine learning, and the simulation of complex systems. Theoretically, it demonstrates the power of combining TNs with generative diffusion models, offering a way to understand and exploit spatial correlations within lattice systems, and potentially extending to more complex topologies via tensor networks such as projected entangled pair states (PEPS) or tree tensor networks (TTNs).

The paper also opens directions for future research, such as establishing the theoretical bounds and limitations of the approach in higher dimensions, extending it to more complex network architectures, and exploiting the symmetries inherent in different stochastic systems.

The use of TNs in defining and understanding generative models could synergize with other machine learning paradigms, leading to advances in both computational methods and theoretical models of large-scale systems. Further integration of this method with existing AI and ML tools could yield significant progress in structured generative modeling and stochastic optimization.
