
Quantum Denoising Diffusion Models

Updated 30 January 2026
  • Quantum Denoising Diffusion Probabilistic Models are advanced generative frameworks that extend classical diffusion processes to quantum state spaces using CPTP maps.
  • They utilize forward quantum Markovian dynamics for controlled noising and employ variational quantum circuits and score matching for effective reverse denoising.
  • The framework addresses practical challenges such as noncommuting observables and barren plateaus, offering scalable training and improved fidelity in high-dimensional Hilbert spaces.

Quantum denoising diffusion probabilistic models (QDDPMs) generalize classical denoising diffusion models to quantum state spaces. They employ quantum Markovian dynamics—implemented as completely positive trace-preserving (CPTP) maps—for forward noising and harness variationally parameterized quantum channels for reverse denoising, enabling generative modeling of high-dimensional classical data, quantum pure states, and mixed quantum states. Technical challenges addressed in the quantum setting include handling noncommuting observables, enforcing physical structure constraints (Hermiticity, positive semidefiniteness, trace), and circumventing barren plateaus in variational optimization. The QDDPM framework represents a convergence of open quantum system theory, variational quantum circuits, and modern generative modeling.

1. Mathematical Formulation of Quantum Diffusion

Denoising diffusion in the quantum domain requires the formalization of noising and denoising processes over spaces of density operators (pure or mixed states). The generic setup represents the system state at step $t$ as a density matrix $\rho_t \in \mathbb{C}^{2^n \times 2^n}$ evolving under a forward Markovian channel

$$\rho_t = \mathcal{E}_t(\rho_{t-1}) = \sum_{k} K_{t,k}\, \rho_{t-1}\, K_{t,k}^\dagger, \qquad \sum_k K_{t,k}^\dagger K_{t,k} = I,$$

where the Kraus constraint ensures complete positivity and trace preservation (Zhang et al., 2023, Kölle et al., 2024, Zhu et al., 15 Nov 2025, Kwun et al., 2024, Chen et al., 8 May 2025, Zhu et al., 2024).

Two principal classes of forward processes are used:

  • Random Unitary (Scrambling) Channels: Each step applies a random unitary $U_t$ (with an angle schedule, or implemented as fast-scrambling circuits), yielding decoherence in the computational basis and, in the large-$T$ limit, convergence to the maximally mixed (or Haar) ensemble (Zhang et al., 2023, Cao et al., 7 Dec 2025, Quinn et al., 22 Sep 2025).
  • Depolarizing Channels: Each step applies a partial depolarizing map,

$$\mathcal{E}_t(\rho) = (1 - q_t)\,\rho + q_t\,\frac{I}{d}$$

with $d = 2^n$ the Hilbert space dimension. The noise schedule ($q_t$ and its associated interpolating function) can be selected to control purity decay (Kwun et al., 2024, Chen et al., 8 May 2025, Zhu et al., 15 Nov 2025, Parigi et al., 2023).
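The depolarizing forward process above can be sketched numerically; the linear schedule `q` below is an illustrative assumption, not a schedule taken from the cited papers:

```python
import numpy as np

# Minimal sketch of the forward depolarizing process on n qubits.
n = 2
d = 2 ** n                      # Hilbert space dimension d = 2^n
T = 20                          # number of diffusion steps
q = np.linspace(0.05, 0.5, T)   # assumed noise schedule q_t

# Start from the pure state |0...0><0...0|
rho = np.zeros((d, d), dtype=complex)
rho[0, 0] = 1.0

purities = [np.trace(rho @ rho).real]
for t in range(T):
    # E_t(rho) = (1 - q_t) * rho + q_t * I/d
    rho = (1 - q[t]) * rho + q[t] * np.eye(d) / d
    purities.append(np.trace(rho @ rho).real)

# Purity decays monotonically from 1 toward the maximally mixed value 1/d
print(round(purities[0], 4), round(purities[-1], 4))
```

The purity trace makes the "controlled purity decay" role of the schedule concrete: a slower-growing $q_t$ retains coherence longer.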

Continuous-time analogs employ Lindblad master equations

$$\frac{d\rho}{dt} = -i[H_s, \rho] + \sum_j \gamma_j \left( L_j \rho L_j^\dagger - \tfrac{1}{2}\{L_j^\dagger L_j, \rho\} \right)$$

with $L_j$ the jump operators and $\gamma_j$ their rates, under Markovian and weak-coupling assumptions (Zhu et al., 15 Nov 2025, Parigi et al., 2023).
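A minimal Euler discretization of such a Lindblad equation, here for a single qubit with a pure-dephasing jump operator; the Hamiltonian, rate, and step size are illustrative choices:

```python
import numpy as np

# Euler-step the Lindblad equation with jump operator L = sigma_z.
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.zeros((2, 2), dtype=complex)   # assume a trivial Hamiltonian for clarity
gamma = 0.5                           # assumed dephasing rate
dt, steps = 0.01, 200

# Start from the coherent state |+><+|
rho = np.full((2, 2), 0.5, dtype=complex)

for _ in range(steps):
    comm = -1j * (H @ rho - rho @ H)
    diss = gamma * (sz @ rho @ sz.conj().T
                    - 0.5 * (sz.conj().T @ sz @ rho + rho @ sz.conj().T @ sz))
    rho = rho + dt * (comm + diss)

coherence = abs(rho[0, 1])   # decays like exp(-2*gamma*t) under pure dephasing
print(round(np.trace(rho).real, 6), round(coherence, 4))
```

The trace stays at one (the generator is trace-annihilating) while the off-diagonal coherence decays, which is exactly the "controlled noising" role the forward process plays.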

Special attention is required for structure preservation in mixed-state QDDPMs. The structure-preserving diffusion model (SPDM) enforces Hermiticity, positive semidefiniteness, and unit trace using mirror maps based on the von Neumann entropy:

$$V(X) = I + \log X, \qquad V^*(Y) = \exp(Y - I),$$

allowing unconstrained Gaussian diffusion in the dual (mirror) space, with the physical constraints guaranteed upon pullback (Zhu et al., 2024).
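The mirror maps can be sketched with a spectral implementation of the matrix log and exp (a minimal sketch, not the SPDM authors' implementation):

```python
import numpy as np

# Entropic mirror maps V(X) = I + log X and V*(Y) = exp(Y - I),
# implemented for Hermitian matrices via eigendecomposition.
def mat_fn(X, f):
    """Apply a scalar function f to a Hermitian matrix through its spectrum."""
    w, U = np.linalg.eigh(X)
    return (U * f(w)) @ U.conj().T

def V(X):
    """Primal -> mirror space (requires X positive definite)."""
    return np.eye(len(X)) + mat_fn(X, np.log)

def V_star(Y):
    """Mirror -> primal space; the output is automatically positive definite."""
    return mat_fn(Y - np.eye(len(Y)), np.exp)

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = A @ A.conj().T
rho /= np.trace(rho).real        # a random full-rank density matrix

# Round trip: V*(V(rho)) recovers rho up to floating point
print(np.allclose(V_star(V(rho)), rho))

# Any Hermitian dual variable pulls back to a positive matrix; trace-one
# still requires the explicit normalization step mentioned in the text.
Y = rng.normal(size=(4, 4)); Y = (Y + Y.T) / 2
print(np.linalg.eigvalsh(V_star(Y)).min() > 0)
```

This makes the pullback guarantee concrete: positivity comes for free from the matrix exponential, so the Gaussian diffusion in mirror space can be entirely unconstrained.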

2. Reverse (Denoising) Process and Variational Channel Parameterization

The reverse process seeks to invert the forward quantum channel sequence. Since the exact reversal is generally infeasible, a parameterized quantum channel is trained to map the noisy marginals back to the target state distribution.

Typical constructions employ a parameterized quantum circuit (PQC) ansatz for the reverse channel, trained step by step to approximate the denoising map.

Score-based reverse SDEs have also been derived in mirror space, where a neural network (score function) is trained to approximate $\nabla_Y \log p_t(Y)$ for dual variables $Y = V(X)$ (Zhu et al., 2024).

In discrete variable and quantum-classical hybrid models, the reverse process can be implemented by PQCs that directly output logits for the conditional probabilities, with sampling performed in a single quantum-circuit evaluation using temporal encoding (Chen et al., 8 May 2025, Falco et al., 19 Jan 2025).

3. Training Objectives, Loss Functions, and Optimization

QDDPM training is based on matching the statistics of the denoised outputs to the data ensemble. The main classes of cost functions are:

  • Fidelity-based MMD Loss:

$$\mathcal{D}_{\rm MMD}(A,B) = \bar F(A,A) + \bar F(B,B) - 2\bar F(A,B)$$

where $\bar F(X,Y) = \mathbb{E}_{\psi \in X,\, \phi \in Y}\big[\, |\langle \psi | \phi \rangle|^2 \,\big]$ is the average pairwise fidelity (Zhang et al., 2023, Zhu et al., 2024, Quinn et al., 22 Sep 2025, Cao et al., 7 Dec 2025).
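A numerical sketch of this fidelity-kernel MMD over finite ensembles of kets:

```python
import numpy as np

# MMD loss D_MMD(A,B) = Fbar(A,A) + Fbar(B,B) - 2*Fbar(A,B), where Fbar
# averages the pairwise fidelity |<psi|phi>|^2 over all pairs.
def fbar(X, Y):
    """X, Y: (num_states, d) arrays whose rows are normalized kets."""
    return (np.abs(X.conj() @ Y.T) ** 2).mean()

def mmd(A, B):
    return fbar(A, A) + fbar(B, B) - 2 * fbar(A, B)

rng = np.random.default_rng(1)

def random_kets(m, d):
    v = rng.normal(size=(m, d)) + 1j * rng.normal(size=(m, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

A = random_kets(50, 8)
B = random_kets(50, 8)

# Identical ensembles give exactly zero; distinct ensembles give D_MMD > 0
# because the fidelity kernel is positive semidefinite.
print(round(mmd(A, A), 12), mmd(A, B) > 0)
```

On hardware the pairwise fidelities would come from overlap estimates rather than statevectors, but the loss assembly is the same.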

  • Superfidelity-based Loss:

For mixed states, the superfidelity

$$G(\rho,\sigma) = \operatorname{Tr}(\rho\sigma) + \sqrt{\big(1 - \operatorname{Tr}\rho^2\big)\big(1 - \operatorname{Tr}\sigma^2\big)}$$

is used in MMD or Wasserstein objectives, avoiding full state tomography (Kwun et al., 2024).
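A direct implementation of the superfidelity, which needs only traces of products (quantities measurable with swap tests) rather than full tomography:

```python
import numpy as np

# Superfidelity G(rho, sigma) from Tr(rho sigma) and the two purities.
def superfidelity(rho, sigma):
    overlap = np.trace(rho @ sigma).real
    p_rho = np.trace(rho @ rho).real
    p_sigma = np.trace(sigma @ sigma).real
    # clip guards against tiny negative values from floating-point purity > 1
    return overlap + np.sqrt(max(0.0, (1 - p_rho) * (1 - p_sigma)))

d = 4
mixed = np.eye(d, dtype=complex) / d          # maximally mixed state
pure = np.zeros((d, d), dtype=complex)
pure[0, 0] = 1.0                              # pure state |0><0|

# G equals 1 whenever the two states coincide
print(round(superfidelity(pure, pure), 6), round(superfidelity(mixed, mixed), 6))
```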

  • Path-constrained Loss (PC):

Penalizes deviations at each intermediate time with

$$\mathcal{L}_{\mathrm{path}} = \big[1 - F(\rho_0, \hat{\rho}_0)\big] + \lambda \sum_t \alpha_t \big[1 - F(\rho_t, \hat{\rho}_t)\big]$$

for fidelity $F$ and weights $\alpha_t$ (Zhu et al., 15 Nov 2025).
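A sketch of this loss using Uhlmann fidelity $F(\rho,\sigma) = (\operatorname{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}})^2$; the trajectory, $\lambda$, and the uniform weights $\alpha_t$ are illustrative assumptions:

```python
import numpy as np

def sqrtm_psd(M):
    """Matrix square root of a positive semidefinite Hermitian matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0, None))) @ U.conj().T

def fidelity(rho, sigma):
    s = sqrtm_psd(rho)
    return np.trace(sqrtm_psd(s @ sigma @ s)).real ** 2

def path_loss(rho0, rho0_hat, path, path_hat, lam=0.1, alphas=None):
    alphas = np.ones(len(path)) / len(path) if alphas is None else alphas
    inter = sum(a * (1 - fidelity(r, rh))
                for a, r, rh in zip(alphas, path, path_hat))
    return (1 - fidelity(rho0, rho0_hat)) + lam * inter

rho = np.array([[1, 0], [0, 0]], dtype=complex)      # pure |0><0|
sigma = np.eye(2, dtype=complex) / 2                 # maximally mixed

# Perfect reconstruction along the path gives zero; mismatch is penalized
print(round(path_loss(rho, rho, [rho], [rho]), 10),
      round(path_loss(rho, sigma, [rho], [sigma]), 6))
```

The intermediate-time terms regularize the whole reverse trajectory, not just the endpoint.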

  • Denoising Score Matching: Used in SPDM, one minimizes

$$L(\theta) = \mathbb{E}\Big[\big\| s_\theta(Y_t, t) + \epsilon / \sqrt{1 - \bar{\alpha}_t} \big\|^2\Big]$$

connecting directly to classical score-based diffusion (Zhu et al., 2024, Kölle et al., 2024).
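The objective can be illustrated with a scalar toy problem in which the exact score is known; the value of $\bar\alpha_t$ here is an arbitrary illustrative choice:

```python
import numpy as np

# Toy denoising score matching: for Y0 ~ N(0,1), the noised variable
# Y_t = sqrt(abar)*Y0 + sqrt(1-abar)*eps is again N(0,1), whose exact
# score is s*(Y) = -Y. The DSM loss should prefer it to a zero score.
rng = np.random.default_rng(2)
abar = 0.5                       # assumed alpha-bar at this timestep
y0 = rng.normal(size=100_000)
eps = rng.normal(size=100_000)
yt = np.sqrt(abar) * y0 + np.sqrt(1 - abar) * eps

def dsm_loss(score):
    return np.mean((score(yt) + eps / np.sqrt(1 - abar)) ** 2)

loss_true = dsm_loss(lambda y: -y)       # exact score of N(0,1)
loss_zero = dsm_loss(lambda y: 0 * y)    # trivial baseline
print(round(loss_true, 2), round(loss_zero, 2))
```

The exact score attains the lower loss, which is the property a trained $s_\theta$ exploits; in SPDM the same objective is posed for the mirror-space variables $Y_t$.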

Gradient computation involves classical backpropagation in simulations or the parameter-shift rule when running on quantum hardware. Optimization typically uses Adam or similar stochastic optimizers, with explicit parameter initialization and regularization via noise schedule selection (Zhu et al., 15 Nov 2025, Kölle et al., 2024, Kwun et al., 2024).
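The parameter-shift rule itself can be verified on a one-parameter, one-qubit example, where the expectation and its gradient are known in closed form:

```python
import numpy as np

# Parameter-shift rule for a gate generated by a Pauli:
# d<f>/dtheta = (f(theta + pi/2) - f(theta - pi/2)) / 2.
# Here f(theta) = <0| RY(theta)^dag Z RY(theta) |0> = cos(theta),
# so the exact gradient is -sin(theta).
def expectation(theta):
    ry = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                   [np.sin(theta / 2),  np.cos(theta / 2)]])
    psi = ry @ np.array([1.0, 0.0])
    Z = np.diag([1.0, -1.0])
    return psi @ Z @ psi

def parameter_shift_grad(f, theta):
    return 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))

theta = 0.7
g = parameter_shift_grad(expectation, theta)
print(round(g, 6), round(-np.sin(theta), 6))
```

Unlike finite differences, the shift rule is exact for Pauli-generated gates, which is why it is the standard gradient estimator on hardware.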

4. Structure Preservation, Conditioning, and Model Extensions

Quantum state spaces are highly structured: density operators must be complex Hermitian, positive semidefinite, and trace one. SPDM achieves strict enforcement via mirror map reparameterization and normalization at each generation step (Zhu et al., 2024).

Conditioning and label guidance play a crucial role in enabling conditional generation (e.g., class-conditional state synthesis or interpolation across entanglement classes). Approaches include:

  • Classifier-free Guidance: Simultaneously training unconditional and conditional denoisers, with label dropout and convex interpolation at generation time (Zhu et al., 2024).
  • Ancilla-based Continuous Conditioning: Rotational encoding of class parameters into ancilla registers enables a single denoiser to interpolate across multiple target distributions, with empirically demonstrated order-of-magnitude improvements in fidelity and MMD relative to unconditioned baselines (Quinn et al., 22 Sep 2025).
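The classifier-free guidance combination at generation time can be sketched as a convex interpolation of the two denoiser outputs; whether the cited QDDPM work uses exactly this linear rule is an assumption here:

```python
import numpy as np

# Convex interpolation of unconditional and conditional denoiser outputs.
def guided_score(s_uncond, s_cond, w):
    """Convex interpolation for w in [0, 1]; w = 1 is fully conditional."""
    return (1 - w) * s_uncond + w * s_cond

s_u = np.array([0.2, -0.1])      # hypothetical unconditional score
s_c = np.array([1.0, -0.5])      # hypothetical conditional score

print(guided_score(s_u, s_c, 1.0), guided_score(s_u, s_c, 0.5))
```

A single model trained with label dropout supplies both outputs, so varying `w` at sampling time trades diversity against conditioning strength without retraining.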

Other significant architectural advances include:

  • One-step and Latent Diffusion: QD3PM enables single-shot sampling from joint distributions by directly learning $p_\theta(\mathbf{x}_0 \mid \mathbf{x}_T)$ using quantum circuits, avoiding classical factorization bottlenecks and depth scaling (Chen et al., 8 May 2025). Hybrid architectures (e.g., quantum latent diffusion) operate in low-dimensional classical latent spaces to reduce circuit depth and facilitate deployment on NISQ hardware (Falco et al., 19 Jan 2025).
  • Channel-Constrained and Open-system Models: CCMQD strictly realizes both forward diffusion and reverse denoising via CPTP maps parameterized as Kraus operators, optimized for physical trace preservation and complete positivity, and connects naturally to open quantum system dynamics (Zhu et al., 15 Nov 2025, Parigi et al., 2023).

5. Empirical Results, Scalability, and Quantum Advantage

Quantum DDPMs have been validated in a range of settings, from simulation of structured pure states and mixed ensembles to generative modeling of classical images in latent quantum space:

  • Pure-state generation: QuDDPM accurately models correlated quantum noise channels, many-body ground state phases, and topological state families, with two orders of magnitude lower MMD error than GAN or direct-transport baselines under equivalent parameter constraints (Zhang et al., 2023).
  • Mixed-state generation: MSQuDDPM produces desired ensembles with mean fidelities $\geq 0.98$ using shallow hardware-efficient ansatzes. Classifier-free and continuous-conditioning schemes robustly interpolate and extrapolate entanglement and magnetization (Kwun et al., 2024, Quinn et al., 22 Sep 2025).
  • Image synthesis benchmarks: QD3PM (for discrete data) and QLDM (latent variable models) achieve lower KL divergence, FID, and KID than classical diffusion models of comparable parameter counts, with QLDM outperforming classical baselines in few-shot learning (Chen et al., 8 May 2025, Falco et al., 19 Jan 2025).
  • Resource scaling: Recent architectures leverage noise schedules (cosine-exponent, small-angle scrambling), shallow parameterizations, and structure-preserving transformations to support training in Hilbert spaces up to $2^{10}$ dimensions (Zhu et al., 2024, Kwun et al., 2024, Chen et al., 8 May 2025, Cao et al., 7 Dec 2025).

A central quantum advantage is the potential to capture and sample from genuinely high-dimensional, entangled, and nonfactorizing joint distributions that are intractable for classical factorized DPMs, both in terms of memory and sampling depth (Chen et al., 8 May 2025). These advances demonstrate practical pathways for NISQ-era generative modeling of both quantum and classical data.

6. Barren Plateau Phenomenon and Scalability Challenges

A significant issue encountered in QDDPMs is the occurrence of barren plateaus—gradient suppression in variational quantum learning—especially when the forward diffusion process rapidly converges to $t$-design (e.g., Haar) ensembles. In such regimes, the gradient variance vanishes exponentially in the system size $n$:

$$\operatorname{Var}(\partial_\theta \mathcal{L}) \leq O(2^{-2n}).$$

This severely limits scalability, as training stagnates for $n \gtrsim 7$ qubits in standard QuDDPM with fully randomizing diffusion steps (Cao et al., 7 Dec 2025). Theoretical analysis reveals that restricting the forward process to remain at a finite MMD distance from Haar (e.g., via angle limitation or reduced circuit depth) restores nonzero gradients:

  • Barren-plateau-mitigated QuDDPM employs controlled-noise schedules (limited-angle, shallow circuits), achieving constant gradient magnitudes $O(10^{-1})$ and sample-fidelity improvements of an order of magnitude (Cao et al., 7 Dec 2025).
  • Similar techniques—shallow circuits, hardware-efficient ansatz restriction, and path-constrained losses—empirically stabilize training in MSQuDDPM and CCMQD models as well (Kwun et al., 2024, Zhu et al., 15 Nov 2025).

7. Comparison to Classical Diffusion Models

Quantum DDPMs generalize their classical counterparts in the following respects:

  • State representation: Classical DPMs act on $\mathbf{x}_t \in \mathbb{R}^d$ (or discrete variables) via Gaussian/additive or categorical kernels; QDDPMs act on $\rho_t \in \mathbb{C}^{d \times d}$ via CPTP maps, encompassing noncommuting observables and genuine quantum correlations (Zhang et al., 2023, Zhu et al., 2024, Chen et al., 8 May 2025).
  • Noising and denoising: The quantum framework enables non-factorizing joint channels, structure-preserving transforms, single-shot sampling from joint distributions, and embedding classically intractable dependencies in the Hilbert space.
  • Optimization: Losses are variationally computed using MMD, superfidelity, path constraints, and denoising-score matching (in mirror or dual space), with explicit enforcement of quantum physical constraints.
  • Quantum advantage: Joint-distribution learning, single-step sampling, and compression of conditional dependencies into quantum circuits provide theoretical and empirical evidence of quantum advantage for generative modeling, especially for high-dimensional or strongly correlated distributions (Chen et al., 8 May 2025, Falco et al., 19 Jan 2025).

