
Theory of overparametrization in quantum neural networks (2109.11676v1)

Published 23 Sep 2021 in quant-ph, cs.LG, and stat.ML

Abstract: The prospect of achieving quantum advantage with Quantum Neural Networks (QNNs) is exciting. Understanding how QNN properties (e.g., the number of parameters $M$) affect the loss landscape is crucial to the design of scalable QNN architectures. Here, we rigorously analyze the overparametrization phenomenon in QNNs with periodic structure. We define overparametrization as the regime where the QNN has more than a critical number of parameters $M_c$ that allows it to explore all relevant directions in state space. Our main results show that the dimension of the Lie algebra obtained from the generators of the QNN is an upper bound for $M_c$, and for the maximal rank that the quantum Fisher information and Hessian matrices can reach. Underparametrized QNNs have spurious local minima in the loss landscape that start disappearing when $M\geq M_c$. Thus, the overparametrization onset corresponds to a computational phase transition where the QNN trainability is greatly improved by a more favorable landscape. We then connect the notion of overparametrization to the QNN capacity, so that when a QNN is overparametrized, its capacity achieves its maximum possible value. We run numerical simulations for eigensolver, compilation, and autoencoding applications to showcase the overparametrization computational phase transition. We note that our results also apply to variational quantum algorithms and quantum optimal control.


Summary

  • The paper establishes that the dimension of the Lie algebra sets an upper bound on QFIM and Hessian ranks, marking the overparametrization threshold.
  • The paper shows that overparametrization boosts model capacity, triggering a phase transition that simplifies loss landscapes for efficient optimization.
  • The paper validates its framework with simulations on variational eigensolvers, unitary compilation, and quantum autoencoding, confirming its practical implications.

An Analysis of Overparametrization in Quantum Neural Networks

Overparametrization is a well-established feature of machine learning, where it aids the training and generalization of classical neural networks (NNs). Recent research extends this idea to the quantum domain and asks how overparametrization manifests in Quantum Neural Networks (QNNs). The paper examines this phenomenon rigorously, providing insights relevant to the design of scalable QNN architectures that could potentially achieve quantum advantage.

Overview and Definition of Overparametrization

The paper begins by defining overparametrization in QNNs as the regime where the number of parameters $M$ exceeds a critical number, termed $M_c$, that allows the QNN to explore all relevant directions in its state space. The dimension of the Lie algebra obtained from the QNN's generators is shown to be an upper bound for $M_c$ and for the maximal rank attainable by the quantum Fisher information and Hessian matrices. This connection ties algebraic structure to the operational characteristics of QNNs.
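To make the algebraic bound concrete, the sketch below computes the dimension of the Lie algebra generated by a set of Hermitian generators by repeatedly taking commutators until no new linearly independent element appears. This is a minimal illustration, not the paper's code: the function name `lie_algebra_dimension`, the tolerance, and the example generators (a single qubit and a two-qubit transverse-field Ising pair) are assumptions chosen for illustration.

```python
import numpy as np

def lie_algebra_dimension(generators, tol=1e-8):
    """Dimension of the real Lie algebra generated by {i*H} under commutators.

    `generators` is a list of Hermitian matrices H (hypothetical helper).
    """
    basis_vecs = []  # orthonormal real vectors spanning the algebra found so far
    elements = []    # the corresponding (normalized) anti-Hermitian matrices

    def try_add(mat):
        norm = np.linalg.norm(mat)
        if norm < tol:                      # zero commutator, nothing to add
            return
        mat = mat / norm
        # Flatten to a real vector so linear independence is tested over the reals.
        vec = np.concatenate([mat.real.ravel(), mat.imag.ravel()])
        for b in basis_vecs:                # modified Gram-Schmidt
            vec = vec - np.dot(b, vec) * b
        residual = np.linalg.norm(vec)
        if residual > tol:                  # genuinely new direction
            basis_vecs.append(vec / residual)
            elements.append(mat)

    for h in generators:
        try_add(1j * np.asarray(h, dtype=complex))

    # Commute every element with all earlier ones; new elements are appended
    # to the end of the list and are processed in turn until closure.
    i = 0
    while i < len(elements):
        for j in range(i):
            try_add(elements[i] @ elements[j] - elements[j] @ elements[i])
        i += 1
    return len(elements)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Two-qubit example: transverse-field term and Ising coupling term.
A = np.kron(X, I2) + np.kron(I2, X)
B = np.kron(Z, Z)

print(lie_algebra_dimension([X]))     # a single generator
print(lie_algebra_dimension([X, Z]))  # generates su(2)
print(lie_algebra_dimension([A, B]))
```

The brute-force closure is only practical for small systems, since the matrices grow exponentially with the number of qubits; it is meant to show what quantity the bound on $M_c$ refers to.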

Theoretical Framework and Key Results

The research presents a theoretical framework that relates overparametrization to the differential properties of QNN loss landscapes. It proves that structural properties of the QNN, in particular the dimension of its dynamical Lie algebra, bound the rank that the quantum Fisher information matrix (QFIM) and the Hessian can reach, and with it the model's capacity.

  1. Rank Upper Bounds: The paper establishes that the dimension of the Lie algebra (denoted $g_S$ when reduced by any symmetries present in the training data) upper-bounds the rank of the QFIM and Hessian matrices. The bound holds for every choice of parameters and suggests that overparametrization is reached once the parameter count $M$ satisfies $M \geq g_S$.
  2. Model Capacity Link: Overparametrization is shown to be directly associated with model capacity, specifically the effective quantum dimension. Once the QNN is overparametrized, its capacity reaches the maximum value allowed by the algebraic bounds.
  3. Impact on Loss Landscapes: Underparametrized QNNs exhibit spurious local minima that start disappearing at the onset of overparametrization, which improves trainability and convergence. The paper describes this as a computational phase transition in which the loss landscape becomes markedly more favorable for optimization.
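The rank bound in item 1 can be checked numerically on a toy circuit. The sketch below builds a two-qubit periodic ansatz that alternates two generators, evaluates the pure-state QFIM $F_{ij} = 4\,\mathrm{Re}\left[\langle \partial_i \psi | \partial_j \psi \rangle - \langle \partial_i \psi | \psi \rangle \langle \psi | \partial_j \psi \rangle\right]$ at random parameters, and reports its rank as the number of parameters $M$ grows. Everything here is an illustrative assumption rather than the paper's setup: the generators, the initial state $|{+}{+}\rangle$, the rank tolerance, and the hard-coded value `DIM_G = 4`, which is the Lie algebra dimension worked out separately for this particular pair of generators.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
A = np.kron(X, I2) + np.kron(I2, X)  # transverse-field generator
B = np.kron(Z, Z)                    # Ising coupling generator
DIM_G = 4                            # assumed Lie algebra dimension for {A, B}

def gate(h, theta):
    """exp(-i * theta * H) for Hermitian H, via eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * theta * w)) @ v.conj().T

def qfim(generators, thetas, psi0):
    """Pure-state quantum Fisher information matrix with exact derivatives."""
    gates = [gate(h, t) for h, t in zip(generators, thetas)]
    grads = []
    for k in range(len(gates)):
        phi = psi0
        for m, u in enumerate(gates):
            phi = u @ phi
            if m == k:
                # d/dtheta_k of exp(-i theta_k H_k) inserts a factor -i H_k
                phi = -1j * (generators[k] @ phi)
        grads.append(phi)
    psi = psi0
    for u in gates:
        psi = u @ psi
    G = np.array(grads)        # row k holds |d_k psi>
    overlaps = G.conj() @ G.T  # <d_i psi | d_j psi>
    proj = G.conj() @ psi      # <d_i psi | psi>
    return 4.0 * np.real(overlaps - np.outer(proj, proj.conj()))

def qfim_rank(num_params, seed=0, tol=1e-8):
    rng = np.random.default_rng(seed)
    thetas = rng.uniform(0.0, 2.0 * np.pi, num_params)
    generators = [B if k % 2 == 0 else A for k in range(num_params)]
    psi0 = np.full(4, 0.5, dtype=complex)  # |++>
    return int(np.linalg.matrix_rank(qfim(generators, thetas, psi0), tol=tol))

ranks = {m: qfim_rank(m) for m in range(1, 9)}
print(ranks)  # rank per parameter count; never exceeds min(M, DIM_G)
```

Because the Lie algebra dimension is only an upper bound, the rank may saturate at a smaller value for a given initial state. The point of the sketch is that adding parameters beyond saturation no longer increases the rank.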

Practical Implications and Simulations

The theoretical assertions are substantiated with empirical evidence obtained from simulating QNNs across several tasks including the Variational Quantum Eigensolver, unitary compilation, and quantum autoencoding. The simulations demonstrate the practical applicability of the theoretical upper bounds, reflecting a consistent correlation between the onset of overparametrization and improved optimization outcomes.

  1. Variational Quantum Eigensolver: The Hamiltonian variational ansatz is shown to quickly reach saturation in model performance and convergence when the parameter count aligns with the dimension predicted by the dynamical Lie algebra, verifying the computed rank boundaries.
  2. Unitary Compilation and Autoencoding: Simulations using hardware-efficient ansatzes affirm that model performance and QFIM ranks consistently meet theoretical predictions as parameters increase, further suggesting that overparametrization is attainable even for standard quantum computing tasks.
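A toy version of the eigensolver experiment can be sketched as follows. It trains a two-qubit Hamiltonian-variational-style ansatz by plain gradient descent for several parameter counts and prints the gap to the exact ground energy. This is a hedged illustration only: the Hamiltonian, field strength, learning rate, step count, finite-difference gradients, and function names are all assumptions, and a two-qubit system is far smaller than anything needed to reproduce the paper's simulations.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
A = np.kron(X, I2) + np.kron(I2, X)  # transverse-field term
B = np.kron(Z, Z)                    # Ising coupling term
H = -B - 1.0 * A                     # target Hamiltonian (field strength assumed)

def evolve(h, theta, psi):
    """Apply exp(-i * theta * H) to psi for Hermitian H."""
    w, v = np.linalg.eigh(h)
    return v @ (np.exp(-1j * theta * w) * (v.conj().T @ psi))

def energy(thetas):
    """Loss: expectation value of H in the ansatz state."""
    psi = np.full(4, 0.5, dtype=complex)  # start in |++>
    for k, t in enumerate(thetas):
        psi = evolve(B if k % 2 == 0 else A, t, psi)
    return float(np.real(psi.conj() @ H @ psi))

def train(num_params, steps=300, lr=0.05, eps=1e-6, seed=0):
    """Gradient descent with central finite-difference gradients."""
    rng = np.random.default_rng(seed)
    thetas = rng.uniform(0.0, 2.0 * np.pi, num_params)
    for _ in range(steps):
        grad = np.zeros(num_params)
        for k in range(num_params):
            shift = np.zeros(num_params)
            shift[k] = eps
            grad[k] = (energy(thetas + shift) - energy(thetas - shift)) / (2 * eps)
        thetas = thetas - lr * grad
    return energy(thetas)

ground = float(np.linalg.eigvalsh(H)[0])  # exact ground energy for comparison
for m in (1, 2, 4, 6):
    print(m, train(m) - ground)  # optimization gap for each parameter count
```

Repeating the loop over many random seeds and recording how often the gap closes is one simple way to probe how the landscape changes as parameters are added.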

Conclusion

The paper's findings deepen the understanding of QNN behavior in high-dimensional parameter spaces, which resembles that of classical networks but is shaped by the quantum mechanical structure of the model. By tying overparametrization to algebraic properties of the QNN, the research relates architectural design directly to trainability, which can inform the development of more robust and trainable QNN models.

These algebraic tools enrich the theory of QNNs and suggest directions for future work in quantum machine learning, both in algorithm design and in the analysis of loss landscapes. As the abstract notes, the results also apply to variational quantum algorithms and quantum optimal control.
