Theory of overparametrization in quantum neural networks (2109.11676v1)

Published 23 Sep 2021 in quant-ph, cs.LG, and stat.ML

Abstract: The prospect of achieving quantum advantage with Quantum Neural Networks (QNNs) is exciting. Understanding how QNN properties (e.g., the number of parameters $M$) affect the loss landscape is crucial to the design of scalable QNN architectures. Here, we rigorously analyze the overparametrization phenomenon in QNNs with periodic structure. We define overparametrization as the regime where the QNN has more than a critical number of parameters $M_c$ that allows it to explore all relevant directions in state space. Our main results show that the dimension of the Lie algebra obtained from the generators of the QNN is an upper bound for $M_c$, and for the maximal rank that the quantum Fisher information and Hessian matrices can reach. Underparametrized QNNs have spurious local minima in the loss landscape that start disappearing when $M\geq M_c$. Thus, the overparametrization onset corresponds to a computational phase transition where the QNN trainability is greatly improved by a more favorable landscape. We then connect the notion of overparametrization to the QNN capacity, so that when a QNN is overparametrized, its capacity achieves its maximum possible value. We run numerical simulations for eigensolver, compilation, and autoencoding applications to showcase the overparametrization computational phase transition. We note that our results also apply to variational quantum algorithms and quantum optimal control.

Summary

The paper establishes that the dimension of the Lie algebra sets an upper bound on QFIM and Hessian ranks, marking the overparametrization threshold.
The paper shows that overparametrization boosts model capacity, triggering a phase transition that simplifies loss landscapes for efficient optimization.
The paper validates its framework with simulations on variational eigensolvers, unitary compilation, and quantum autoencoding, confirming its practical implications.

An Analysis of Overparametrization in Quantum Neural Networks

Overparametrization is a well-studied feature of classical machine learning, where it aids the training and generalization of neural networks (NNs). Extending this idea to the quantum domain, recent research has examined how overparametrization manifests in Quantum Neural Networks (QNNs). The paper analyzes this phenomenon rigorously for QNNs with periodic structure, providing insights relevant to designing scalable QNN architectures that could achieve quantum advantage.

Overview and Definition of Overparametrization

The paper defines overparametrization in QNNs as the regime where the number of parameters $M$ exceeds a critical number $M_c$, beyond which the QNN can explore all relevant directions in state space. The dimension of the Lie algebra obtained from the QNN's generators is shown to be an upper bound for $M_c$, as well as for the maximal rank that the quantum Fisher information and Hessian matrices can reach. This connection ties the algebraic structure of a QNN to its operational characteristics.
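
In symbols, these bounds can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $F(\theta)$ is the quantum Fisher information matrix at parameters $\theta$, $\nabla^2 \mathcal{L}(\theta)$ is the Hessian of the loss, and $\dim(\mathfrak{g})$ is the dimension of the Lie algebra obtained from the QNN's generators.

$$
M_c \leq \dim(\mathfrak{g}), \qquad \max_{\theta} \operatorname{rank} F(\theta) \leq \dim(\mathfrak{g}), \qquad \max_{\theta} \operatorname{rank} \nabla^2 \mathcal{L}(\theta) \leq \dim(\mathfrak{g}).
$$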

Theoretical Framework and Key Results

The research presents a theoretical framework that relates overparametrization to the local structure of QNN loss landscapes. It proves that structural properties of the QNN, chiefly the dimension of its dynamical Lie algebra, bound the ranks of the quantum Fisher information matrix (QFIM) and of the Hessian, and it connects these bounds to model capacity.

Rank Upper Bounds: The paper establishes that the dimension of the Lie algebra (denoted as $g_S$ when reduced by any symmetries present in the training data) provides an upper bound for the rank of the QFIM and Hessian matrices. This bound holds for every choice of parameter values and suggests that overparametrization can be reached once the parameter count $M$ satisfies $M \geq g_S$.
Model Capacity Link: Overparametrization is shown to be directly tied to model capacity, specifically the effective quantum dimension. Once the QNN is overparametrized, its capacity attains its maximum possible value, which is set by the same algebraic bounds.
Impact on Loss Landscapes: Underparametrized QNNs have spurious local minima that start disappearing once $M \geq M_c$, which improves trainability and convergence. The paper describes this onset as a computational phase transition, after which the loss landscape is markedly more favorable for optimization.
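
To make the algebraic bound concrete, the following is a minimal sketch of how the dimension of a dynamical Lie algebra could be computed numerically for a very small system. It is not the paper's code: the function name `dla_dimension`, the helper `kron`, and the two-qubit transverse-field Ising generators `H_ZZ` and `H_X` are illustrative assumptions. The routine repeatedly takes commutators of the anti-Hermitian matrices $iH_k$ and keeps only those that are linearly independent over the reals.

```python
import numpy as np

# Single-qubit Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def kron(*ops):
    """Tensor product of a sequence of operators (hypothetical helper)."""
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out


def dla_dimension(hamiltonians, tol=1e-9):
    """Dimension of the real Lie algebra generated by {i * H_k} under commutation."""
    ortho = []  # orthonormal real vectors spanning the algebra found so far
    elems = []  # linearly independent anti-Hermitian matrices with the same span

    def add(a):
        """Add matrix `a` if it is linearly independent of the current span."""
        norm = np.linalg.norm(a)
        if norm < tol:
            return False
        a = a / norm
        # Treat the complex matrix as a real vector (real parts, then imaginary parts).
        v = np.concatenate([a.real.ravel(), a.imag.ravel()])
        for b in ortho:  # Gram-Schmidt against the current basis
            v = v - (b @ v) * b
        residual = np.linalg.norm(v)
        if residual < tol:
            return False
        ortho.append(v / residual)
        elems.append(a)
        return True

    for h in hamiltonians:
        add(1j * np.asarray(h, dtype=complex))

    # Keep taking commutators until no new independent element appears.
    grew = True
    while grew:
        grew = False
        for a in list(elems):
            for b in list(elems):
                if add(a @ b - b @ a):
                    grew = True
    return len(elems)


# Illustrative generators: two-qubit transverse-field Ising terms.
H_ZZ = kron(Z, Z)
H_X = kron(X, I2) + kron(I2, X)

print(dla_dimension([X, Z]))        # single qubit, generates su(2): 3
print(dla_dimension([H_ZZ, H_X]))   # 4 for this two-qubit toy example
```

The brute-force closure scales poorly with system size and is meant only to illustrate the quantity that bounds $M_c$; for this toy example the algebra is spanned by $X_1 + X_2$, $Z_1 Z_2$, $Y_1 Y_2$, and $Y_1 Z_2 + Z_1 Y_2$.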

Practical Implications and Simulations

The theoretical claims are supported by numerical simulations of QNNs on several tasks, including the Variational Quantum Eigensolver, unitary compilation, and quantum autoencoding. The simulations are consistent with the theoretical upper bounds and show that the onset of overparametrization coincides with improved optimization outcomes.

Variational Quantum Eigensolver: With the Hamiltonian variational ansatz, model performance and convergence saturate once the parameter count reaches the dimension predicted by the dynamical Lie algebra, consistent with the derived rank bounds.
Unitary Compilation and Autoencoding: Simulations using hardware-efficient ansatzes affirm that model performance and QFIM ranks consistently meet theoretical predictions as parameters increase, further suggesting that overparametrization is attainable even for standard quantum computing tasks.
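
The kind of rank check described above can be sketched on a toy problem. The example below is an assumption-laden illustration, not a reproduction of the paper's simulations: it uses a two-qubit periodic ansatz that alternates the hypothetical generators `H_ZZ` and `H_X`, starts from the state $|00\rangle$, and evaluates the pure-state QFIM, $F_{ij} = 4\,\mathrm{Re}\left[\langle \partial_i \psi | \partial_j \psi \rangle - \langle \partial_i \psi | \psi \rangle \langle \psi | \partial_j \psi \rangle\right]$, at random parameters. For these two generators the Lie algebra has dimension 4, so the QFIM rank should never exceed 4 however many parameters are used.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Illustrative two-qubit generators (transverse-field Ising terms).
H_ZZ = np.kron(Z, Z)
H_X = np.kron(X, I2) + np.kron(I2, X)
GENERATORS = [H_ZZ, H_X]
DLA_DIM = 4  # dimension of the Lie algebra generated by i*H_ZZ and i*H_X


def expm_herm(h, theta):
    """Return exp(-i * theta * h) for a Hermitian matrix h."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * theta * w)) @ v.conj().T


def qfim(thetas, generators, psi0):
    """Pure-state QFIM of |psi> = prod_l exp(-i theta_l H_l) |psi0> (periodic layers)."""
    m = len(thetas)
    gens = [generators[l % len(generators)] for l in range(m)]
    unitaries = [expm_herm(gens[l], thetas[l]) for l in range(m)]

    # Output state.
    psi = psi0.copy()
    for u in unitaries:
        psi = u @ psi

    # Derivative with respect to theta_k: insert -i * H_k right after gate k.
    derivs = []
    for k in range(m):
        phi = psi0.copy()
        for l in range(m):
            phi = unitaries[l] @ phi
            if l == k:
                phi = -1j * (gens[k] @ phi)
        derivs.append(phi)

    d = np.array(derivs)        # shape (m, dim)
    gram = d.conj() @ d.T       # <d_i psi | d_j psi>
    overlap = d.conj() @ psi    # <d_i psi | psi>
    return 4 * np.real(gram - np.outer(overlap, overlap.conj()))


rng = np.random.default_rng(0)
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0  # |00>

ranks = {}
for m in (2, 4, 6, 8, 12):
    thetas = rng.uniform(0, 2 * np.pi, size=m)
    F = qfim(thetas, GENERATORS, psi0)
    ranks[m] = int(np.linalg.matrix_rank(F, tol=1e-8))

print(ranks)  # each rank is at most min(m, DLA_DIM)
```

The rank actually reached also depends on the input state and its symmetries, so it may saturate below the Lie algebra dimension; the sketch only checks the upper bound.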

Conclusion

The paper's findings deepen the understanding of how QNNs behave as the number of parameters grows, a regime that parallels classical networks but is governed by quantum mechanical structure. By tying overparametrization to algebraic properties of the QNN, the research links architectural design to trainability and points toward more robust and trainable QNN models.

These algebraic results add to the theory of QNNs and suggest directions for future work in quantum machine learning, both in algorithm design and in the analysis of loss landscapes. As the paper notes, the results also apply to variational quantum algorithms and quantum optimal control.
