Generative quantum machine learning via denoising diffusion probabilistic models

Published 9 Oct 2023 in quant-ph, cs.AI, and cs.LG | arXiv:2310.05866v4

Abstract: Deep generative models are a key enabling technology for computer vision, text generation, and LLMs. Denoising diffusion probabilistic models (DDPMs) have recently gained much attention due to their ability to generate diverse, high-quality samples in many computer vision tasks, as well as their flexible model architectures and relatively simple training scheme. Quantum generative models, empowered by entanglement and superposition, have brought new insight to learning classical and quantum data. Inspired by the classical counterpart, we propose the quantum denoising diffusion probabilistic model (QuDDPM) to enable efficiently trainable generative learning of quantum data. QuDDPM adopts sufficient layers of circuits to guarantee expressivity, while it introduces multiple intermediate training tasks as interpolation between the target distribution and noise to avoid barren plateaus and guarantee efficient training. We provide bounds on the learning error and demonstrate QuDDPM's capability in learning a correlated quantum noise model, quantum many-body phases, and the topological structure of quantum data. The results provide a paradigm for versatile and efficient quantum generative learning.


Summary

  • The paper presents a novel QuDDPM framework that extends classical diffusion models to generate quantum state ensembles via forward scrambling and backward measurement-based denoising.
  • The study employs incremental training with intermediate loss functions to mitigate barren plateaus, optimizing parameterized quantum circuits for each diffusion step.
  • The paper demonstrates improved performance over benchmark quantum models in tasks such as learning correlated quantum noise and many-body phase transitions.

Quantum Denoising Diffusion Probabilistic Models (QuDDPMs) extend the principles of classical denoising diffusion probabilistic models to the quantum domain, providing a framework for learning distributions of quantum states, referred to as quantum data ensembles. Unlike classical DDPMs, which operate on probability distributions over classical data, QuDDPMs aim to generate samples from a target distribution $D_0$ consisting of quantum states $\{\ket{\psi_k}\}$. This approach leverages the structure of diffusion models to enable trainable generative learning for complex quantum data.

Methodology of QuDDPM

The QuDDPM framework consists of two main processes: a forward quantum scrambling process and a backward measurement-based denoising process.

Forward Quantum Scrambling Process

The forward process transforms samples from the target quantum data ensemble $D_0$ into an ensemble $D_T$ of random noise states (whose ensemble average is close to the maximally mixed state) over a sequence of $T$ steps.

  1. Start with a set of quantum states $\{\ket{\psi_k^{(0)}}\}$ drawn from the target distribution $D_0$.
  2. For each state $\ket{\psi_k^{(0)}}$, apply a sequence of $T$ random unitary operations $U_t^{(k)}$ drawn from a suitable distribution (e.g., Haar random or unitary $t$-designs). This defines a sequence of intermediate ensembles $D_t = \{\ket{\psi_k^{(t)}} = U_t^{(k)} \dots U_1^{(k)} \ket{\psi_k^{(0)}}\}_k$ for $t = 1, \dots, T$.
  3. Each step $t$ adds noise, gradually scrambling the initial structure. The ensemble $D_T$ approximates a distribution of random quantum states (e.g., states drawn uniformly from the Hilbert space, approaching the maximally mixed state in the ensemble average). The sequence $D_0, D_1, \dots, D_T$ serves as an interpolation between the target data distribution and the noise distribution; a minimal numerical sketch of this forward scrambling follows the list.
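
The sketch below illustrates the forward process, assuming each step applies a weak global random unitary $e^{-i\epsilon H}$ with a Gaussian random Hermitian generator; the paper instead uses shallow local random circuits per step, so this is an illustration of the idea rather than the paper's construction.

```python
# Minimal sketch of forward scrambling: T steps of weak global random unitaries
# (an illustrative stand-in for the paper's shallow local random circuits).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def random_step_unitary(dim, eps=0.1):
    """Weak random unitary exp(-i*eps*H), with H a random Hermitian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = (a + a.conj().T) / 2
    return expm(-1j * eps * h)

def forward_scramble(psi0, T=20, eps=0.1):
    """Return the trajectory [psi^(0), ..., psi^(T)] for one data sample."""
    trajectory = [psi0]
    psi = psi0
    for _ in range(T):
        psi = random_step_unitary(psi0.shape[0], eps) @ psi
        trajectory.append(psi)
    return trajectory

# Example: scramble |0...0> on 3 qubits and check how far it has diffused
n = 3
psi0 = np.zeros(2**n, dtype=complex); psi0[0] = 1.0
traj = forward_scramble(psi0)
print(abs(np.vdot(psi0, traj[-1]))**2)  # fidelity with the initial state after T steps
```

Running each data sample through such a routine yields the interpolating ensembles $D_1, \dots, D_T$ that serve as intermediate training targets.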

Backward Measurement-Based Denoising Process

The backward process aims to reverse the forward scrambling, starting from the noise ensemble and progressively denoising it to approximate the target ensemble $D_0$. This involves trainable parameterized quantum circuits (PQCs).

  1. Initialize the process by sampling states $\ket{\tilde{\psi}^{(T)}}$ from a noise distribution $\tilde{S}_T$ that approximates $D_T$.
  2. For steps $t = T, T-1, \dots, 1$, apply a trainable unitary $\tilde{U}_t(\theta_t)$ to the current system state $\ket{\tilde{\psi}^{(t)}}$ and ancillary qubits initialized to $\ket{0}^{\otimes n_a}$. The PQC $\tilde{U}_t(\theta_t)$ acts on the joint system-ancilla space.
  3. Perform a projective measurement on the ancilla qubits. The post-measurement state of the system qubits, conditioned on a specific outcome (typically $\ket{0}^{\otimes n_a}$), becomes the input for the next step, $\ket{\tilde{\psi}^{(t-1)}}$. Measurements are crucial for the contractive nature of the denoising map while preserving state purity.
  4. The sequence of operations generates ensembles $\tilde{D}_t$ intended to approximate the forward ensembles $D_t$. The final output ensemble $\tilde{D}_0 = \{\ket{\tilde{\psi}^{(0)}}\}$ approximates the target distribution $D_0$; a sketch of a single denoising step follows this list.
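
Below is a minimal sketch of one backward denoising step, assuming a toy exponential-of-generators ansatz and post-selection of the ancilla on $\ket{0\dots0}$; the ansatz, function names, and parameter values are illustrative, not taken from the paper.

```python
# Minimal sketch of one backward denoising step: apply a parameterized unitary to
# system ⊗ ancilla, then keep the system state conditioned on ancilla = |0...0>.
# The ansatz below is a toy choice, not the circuit used in the paper.
import numpy as np
from scipy.linalg import expm

def parameterized_unitary(theta, dim, seed=42):
    """Toy ansatz U(theta) = exp(-i * sum_k theta_k H_k) with fixed random Hermitian H_k."""
    rng = np.random.default_rng(seed)  # fixed generators so U depends only on theta
    gen = np.zeros((dim, dim), dtype=complex)
    for th in theta:
        a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
        gen += th * (a + a.conj().T) / 2
    return expm(-1j * gen)

def denoise_step(psi_sys, theta, n_anc=1):
    """One measurement-based denoising step; returns the post-selected system state."""
    d_sys, d_anc = psi_sys.shape[0], 2**n_anc
    anc0 = np.zeros(d_anc); anc0[0] = 1.0
    joint = np.kron(psi_sys, anc0)                      # |psi>_sys ⊗ |0...0>_anc
    joint = parameterized_unitary(theta, d_sys * d_anc) @ joint
    branch = joint.reshape(d_sys, d_anc)[:, 0]          # project ancilla onto |0...0>
    p0 = np.vdot(branch, branch).real                   # probability of that outcome
    return branch / np.sqrt(p0), p0

# Example: denoise a single-qubit state with one ancilla qubit
psi, p0 = denoise_step(np.array([1.0, 0.0], dtype=complex), theta=np.array([0.2, -0.1]))
print(psi, p0)
```

Post-selecting on a single outcome keeps the system state pure, and the acceptance probability p0 determines the sampling overhead of that post-selection.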

Training Strategy and Mitigation of Barren Plateaus

Training QuDDPMs involves optimizing the parameters $\{\theta_t\}$ of the backward PQCs $\tilde{U}_t$. A key challenge in training deep PQCs is the barren plateau phenomenon, where gradients vanish exponentially with the number of qubits or circuit depth. QuDDPM addresses this using a divide-and-conquer strategy inherent in the diffusion model structure.

  1. Incremental Training: Training proceeds backward from $t = T$ down to $t = 1$. In cycle $(T+1-t)$, the parameters $\theta_t$ of the PQC $\tilde{U}_t$ are optimized.
  2. Intermediate Loss Functions: The objective at step $t$ is to minimize a distance metric between the generated ensemble $\tilde{D}_{t-1}$ (obtained by applying $\tilde{U}_T, \dots, \tilde{U}_t$ starting from noise $\tilde{S}_T$) and the corresponding forward-diffused ensemble $D_{t-1}$.
  3. Shallow Depth per Step: A single PQC transforming noise into the target data may require significant depth, potentially $\Omega(n)$ layers for $n$ qubits, leading to barren plateaus if trained end-to-end. By breaking the process into $T$ steps, where $T$ can be chosen strategically (e.g., $T \sim n/\log n$), each individual PQC $\tilde{U}_t$ can have relatively shallow depth (e.g., $O(\log n)$ layers). Training these shallower circuits individually mitigates the barren plateau problem and enables efficient optimization; the theoretical analysis suggests that the loss landscape for each step is better behaved than for a single deep circuit. A schematic of the incremental training loop is sketched after this list.
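
The schematic below sketches the incremental training loop. It assumes user-supplied callables `generate_backward(thetas, t)` (producing samples of the generated ensemble $\tilde{D}_{t-1}$ by running the backward steps from noise) and `ensemble_loss(gen, target)` (e.g., the MMD of the next section); both names, and the finite-difference gradient, are illustrative conveniences rather than the paper's implementation.

```python
# Schematic incremental training: optimize one diffusion step at a time, from t = T
# down to t = 1, against the corresponding forward-diffused ensemble D_{t-1}.
import numpy as np

def train_quddpm(forward_ensembles, generate_backward, ensemble_loss,
                 T, n_params, lr=0.1, epochs=200, shift=1e-3, seed=0):
    """forward_ensembles[t] holds samples of D_t; returns the trained parameters."""
    rng = np.random.default_rng(seed)
    thetas = {t: rng.normal(scale=0.1, size=n_params) for t in range(1, T + 1)}
    for t in range(T, 0, -1):                        # cycle (T + 1 - t) trains step t
        target = forward_ensembles[t - 1]            # forward ensemble D_{t-1}
        for _ in range(epochs):
            base = ensemble_loss(generate_backward(thetas, t), target)
            grad = np.zeros(n_params)
            for k in range(n_params):                # toy finite-difference gradient
                thetas[t][k] += shift
                grad[k] = (ensemble_loss(generate_backward(thetas, t), target) - base) / shift
                thetas[t][k] -= shift
            thetas[t] -= lr * grad
    return thetas
```

In a hardware setting the finite-difference loop would be replaced by parameter-shift or gradient-free optimization, but the step-by-step structure, with each step trained against its own intermediate target ensemble, is what keeps each optimized circuit shallow.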

Loss Functions for Quantum State Ensembles

Since analytical likelihood calculations used in classical DDPMs are generally intractable for quantum states, QuDDPM relies on distance metrics between quantum state ensembles, estimated via quantum measurements.

  1. Maximum Mean Discrepancy (MMD): MMD compares two distributions via the mean embeddings of their samples in a reproducing kernel Hilbert space. For QuDDPM, the kernel is often chosen based on fidelity, $k(\rho, \sigma) = \mathrm{Tr}(\rho\sigma) = |\langle\psi_\rho|\psi_\sigma\rangle|^2$ for pure states $\ket{\psi_\rho}, \ket{\psi_\sigma}$. The MMD loss requires estimating pairwise fidelities between states sampled from the target ensemble $D_{t-1}$ and the generated ensemble $\tilde{D}_{t-1}$.
  2. Wasserstein Distance (Quantum EMD): The Wasserstein-1 distance (or Earth Mover's Distance, EMD) measures the minimum cost of transporting one distribution into another. In the quantum context, using the infidelity $c(\ket{\phi}, \ket{\psi}) = 1 - |\langle\phi|\psi\rangle|^2$ as the cost function captures geometric structure in the distribution of quantum states.
  3. Estimation: Both the MMD and the Wasserstein distance require estimating pairwise fidelities $|\langle\psi_k|\tilde{\psi}_j\rangle|^2$ between states from the two ensembles. This can be done with the SWAP test or related quantum algorithms on a quantum computer, requiring multiple copies of the states or controlled operations. Both losses are sketched numerically after this list.
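
Both losses can be sketched classically for ensembles stored as state-vector arrays, with fidelities computed exactly rather than estimated from measurements; the helper names below are illustrative. The Wasserstein-1 distance between two equal-size, uniformly weighted ensembles reduces to an optimal assignment problem, which `scipy.optimize.linear_sum_assignment` solves.

```python
# Minimal sketches of the two ensemble losses, assuming ensembles are given as
# rows of complex state-vector arrays and fidelities are computed exactly
# (on hardware they would instead be estimated, e.g., with SWAP tests).
import numpy as np
from scipy.optimize import linear_sum_assignment

def fidelity_matrix(A, B):
    """F[i, j] = |<A_i|B_j>|^2 for state vectors stored as rows of A and B."""
    return np.abs(A.conj() @ B.T) ** 2

def mmd_squared(gen, target):
    """MMD^2 with the fidelity kernel k(psi, phi) = |<psi|phi>|^2 (simple biased estimator)."""
    return (fidelity_matrix(gen, gen).mean()
            + fidelity_matrix(target, target).mean()
            - 2.0 * fidelity_matrix(gen, target).mean())

def wasserstein_1(gen, target):
    """W1 between equal-size uniform ensembles with infidelity cost 1 - |<phi|psi>|^2,
    computed as an optimal assignment (valid because both measures are uniform)."""
    cost = 1.0 - fidelity_matrix(gen, target)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

# Example: two batches of Haar-random single-qubit states
rng = np.random.default_rng(1)
def random_states(m, dim=2):
    v = rng.normal(size=(m, dim)) + 1j * rng.normal(size=(m, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)
print(mmd_squared(random_states(40), random_states(40)),
      wasserstein_1(random_states(40), random_states(40)))
```

For a fixed number of samples per ensemble, both losses need all pairwise fidelities, which is what drives the measurement cost discussed under Implementation Considerations.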

Applications and Performance

The paper demonstrates QuDDPM's capabilities on several quantum learning tasks:

  1. Learning Correlated Quantum Noise: QuDDPM effectively learned the distribution of states produced by applying probabilistic coherent errors (e.g., $XX$ and $ZZ$ rotations with certain probabilities) to an initial state such as $\ket{0}^{\otimes n}$. This is relevant for characterizing noise in quantum devices.
  2. Learning Quantum Many-Body Phases: The model learned to generate ground states of the 1D transverse-field Ising model (TFIM) within its ferromagnetic phase, showcasing its ability to capture the correlations typical of many-body systems.
  3. Learning Topological Structures: Using the Wasserstein distance as the loss function, QuDDPM learned to generate an ensemble of single-qubit states forming a specific topological structure (a ring in the $X$-$Z$ plane of the Bloch sphere; a toy construction of such a target ensemble is sketched below). This highlights its sensitivity to the geometric arrangement of states in Hilbert space.
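
As an illustration of the topological-structure task, the snippet below builds a toy target ensemble of single-qubit states whose Bloch vectors lie on a ring in the $X$-$Z$ plane; the exact ring used in the paper may be parameterized differently.

```python
# Toy construction of a target ensemble for the topological-structure task:
# single-qubit pure states whose Bloch vectors form a ring in the X-Z plane.
import numpy as np

def ring_states(m, seed=0):
    """Sample m states cos(a/2)|0> + sin(a/2)|1> with random angle a; real amplitudes
    give Bloch vectors (sin a, 0, cos a), i.e., points on the X-Z great circle."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0, 2 * np.pi, size=m)
    return np.stack([np.cos(angles / 2), np.sin(angles / 2)], axis=1).astype(complex)

states = ring_states(100)
# Bloch-vector components: x = 2*Re(c0* c1), z = |c0|^2 - |c1|^2
x = 2 * np.real(states[:, 0].conj() * states[:, 1])
z = np.abs(states[:, 0])**2 - np.abs(states[:, 1])**2
assert np.allclose(x**2 + z**2, 1.0)   # every point lies on the unit circle in the X-Z plane
```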

Numerical simulations indicated that QuDDPM outperformed benchmark quantum generative models, such as generalized versions of QuGAN and quantum direct transport (QuDT), in generating target quantum state ensembles, in particular demonstrating better sample quality and training stability. Theoretical bounds on the learning error were also provided, connecting the performance to the number of diffusion steps $T$, the circuit depth per step, and the number of training samples.

Implementation Considerations

Implementing QuDDPM involves several practical choices:

  • Number of Diffusion Steps ($T$): A larger $T$ allows for shallower circuits per step, potentially improving trainability, but increases the total number of training cycles and circuit executions. The choice often involves a trade-off, with $T \sim n/\log n$ suggested as a heuristic.
  • PQC Architecture: The expressivity and entanglement capability of the PQCs $\tilde{U}_t$ are crucial. Hardware-efficient ansätze or problem-specific designs might be employed.
  • Ancilla Qubits: The backward denoising process requires ancilla qubits for measurements. The number of ancillas affects the complexity and the nature of the effective quantum channel implemented by each step.
  • Measurement Cost: Estimating the loss function (MMD or Wasserstein) requires numerous fidelity estimations (e.g., via SWAP tests), which can be resource-intensive on current quantum hardware; a sketch of the shot cost of a single SWAP-test estimate follows this list.
  • Computational Resources: Training involves repeated execution of quantum circuits (forward process simulation or execution, backward PQC execution, and measurement for loss estimation) coupled with classical optimization of PQC parameters. This requires significant classical and quantum computational resources.
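
To make the measurement cost concrete, the following sketch simulates a SWAP-test estimate of a single fidelity $|\langle\psi|\phi\rangle|^2$, using the standard fact that the test's ancilla reads 0 with probability $(1+F)/2$. This is a classical simulation for illustration, not a hardware recipe; the shot count makes the per-fidelity overhead explicit.

```python
# Simulated SWAP-test fidelity estimate: sample the ancilla outcome with
# probability p(0) = (1 + F)/2 and invert to recover F = |<psi|phi>|^2.
import numpy as np

def swap_test_fidelity(psi, phi, shots=1000, seed=0):
    """Monte-Carlo estimate of F from simulated SWAP-test outcomes."""
    rng = np.random.default_rng(seed)
    f_true = np.abs(np.vdot(psi, phi)) ** 2
    p0 = (1.0 + f_true) / 2.0                      # Pr(ancilla = 0) in the SWAP test
    zeros = rng.binomial(shots, p0)
    return 2.0 * zeros / shots - 1.0               # invert p0 = (1 + F)/2

psi = np.array([1.0, 0.0], dtype=complex)
phi = np.array([np.cos(0.3), np.sin(0.3)], dtype=complex)
print(swap_test_fidelity(psi, phi), np.abs(np.vdot(psi, phi))**2)  # estimate vs. exact
```

Each loss evaluation needs on the order of (samples per ensemble)$^2$ such estimates, each with its own shot budget, which is the overhead flagged under Measurement Cost.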

In conclusion, QuDDPM offers a structured and potentially more trainable approach to quantum generative learning compared to monolithic PQC-based models like QuGANs, particularly for complex distributions of quantum states. Its effectiveness stems from the divide-and-conquer strategy inherent in diffusion models, which helps mitigate barren plateaus by breaking the generation task into smaller, manageable denoising steps. The framework has shown promise in learning physically relevant quantum data distributions, although practical implementation faces challenges related to measurement overhead and resource requirements.
