
Quantum Neural Networks: Principles & Applications

Updated 7 September 2025
  • Quantum Neural Networks are quantum analogues of classical neural networks that use parameterized quantum circuits to harness effects like superposition and entanglement for learning tasks.
  • They implement layered architectures built from quantum perceptrons and use gradient-based optimization with a quantum analogue of backpropagation to update unitary gate parameters layer by layer.
  • Numerical benchmarks show robustness to corrupted training data and no evidence of barren plateaus for this architecture, supporting scalable deployment on NISQ devices.

Quantum Neural Networks (QNNs) are the quantum analogue of classical neural networks, designed to harness quantum mechanical effects (superposition, entanglement, and measurement-induced nonlinearity) to perform learning tasks in scenarios where either the data or the computational primitives are inherently quantum. QNNs can be realized as parameterized quantum circuits, where the adjustable parameters of unitary gates play the role of network weights and measurements on the processed quantum state provide the analogue of neural activations. Recent research has rigorously analyzed the theoretical capabilities, generalization properties, training principles, and resource requirements of QNNs, and has proposed scalable architectures and efficient optimization algorithms suitable for both classical simulation and direct execution on near-term quantum devices.
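As a toy illustration of this picture, the sketch below (plain numpy; the gate choice and names are illustrative, not taken from any specific QNN proposal) builds a two-qubit parameterized circuit in which rotation angles act as trainable weights and a measured expectation value acts as the activation:

```python
import numpy as np

# Toy sketch: a 2-qubit parameterized circuit whose gate angles play the role
# of weights and whose measured expectation value plays the role of an
# activation. Gate choice and names are illustrative only.

I2 = np.eye(2, dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def ry(theta):
    """Single-qubit rotation about Y; theta acts as a trainable weight."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def circuit_output(weights, psi_in):
    """Apply parameterized rotations and an entangling gate, then return the
    expectation value of Z on the second qubit as the 'activation'."""
    U = CNOT @ np.kron(ry(weights[0]), ry(weights[1]))
    psi = U @ psi_in
    return np.real(np.vdot(psi, np.kron(I2, Z) @ psi))

psi_in = np.zeros(4, dtype=complex)
psi_in[0] = 1.0                      # |00> input state
print(circuit_output([0.3, 1.2], psi_in))
```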

1. Quantum Neuron Definitions and Layered Architectures

QNN architectures are built from quantum generalizations of perceptrons (quantum “neurons”), modeled as arbitrary unitary operators acting jointly on input qubits and ancilla/output qubits. In the most general form, each quantum perceptron is specified as a unitary U operating on m input qubits and one output qubit:

  • The network is layered, with perceptrons applied in parallel within each layer.
  • The output of layer l is produced by applying that layer's perceptron unitaries in sequence; since these unitaries generally do not commute, their ordered product defines the global layer unitary.
  • After each layer, a partial trace is performed to eliminate the previous layer's degrees of freedom, leaving a (possibly mixed) quantum state on the current layer.
  • The QNN as an entire object implements a quantum channel, which mathematically is a composition of completely positive (CP) maps:

\rho_{\text{out}} = \mathcal{E}_{\text{out}} \circ \mathcal{E}^L \circ \dots \circ \mathcal{E}^1 (\rho_{\text{in}})

where each \mathcal{E}^l is the CP map formed by perceptron layer l. This formalism mirrors the feed-forward structure of classical deep neural networks but is intrinsically quantum in representation and operation (Beer et al., 2019).
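A minimal sketch of this feed-forward structure follows, assuming each layer unitary is supplied as a dense matrix acting on the previous-layer qubits followed by the current-layer qubits (function names are illustrative):

```python
import numpy as np

def partial_trace_first(rho, dim_a, dim_b):
    """Trace out the first subsystem (dimension dim_a) of a state on a
    bipartite space of dimension dim_a * dim_b."""
    return np.einsum('ijik->jk', rho.reshape(dim_a, dim_b, dim_a, dim_b))

def layer_channel(rho_prev, U_layer, n_prev, n_curr):
    """One layer CP map: attach |0...0><0...0| on the current layer, apply the
    layer unitary, and trace out the previous layer (leaving a mixed state)."""
    d_prev, d_curr = 2 ** n_prev, 2 ** n_curr
    zeros = np.zeros((d_curr, d_curr), dtype=complex)
    zeros[0, 0] = 1.0                       # |0...0><0...0| on the new layer
    joint = np.kron(rho_prev, zeros)        # previous layer (x) fresh ancillas
    joint = U_layer @ joint @ U_layer.conj().T
    return partial_trace_first(joint, d_prev, d_curr)

def feed_forward(rho_in, layer_unitaries, widths):
    """Compose the layer channels E^L o ... o E^1 on an input state rho_in;
    widths[l] is the number of qubits in layer l (widths[0] = input layer)."""
    rho = rho_in
    for l, U in enumerate(layer_unitaries):
        rho = layer_channel(rho, U, widths[l], widths[l + 1])
    return rho
```

Only two adjacent layers are ever held in memory at once, which is the property exploited later in the text for depth-independent memory cost.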

2. Training Principles and Quantum Backpropagation

The training objective is the optimization of a quantum cost function, typically the average fidelity between network output states and target states. Given training pairs (|\phi^{\text{in}}_x\rangle, |\phi^{\text{out}}_x\rangle), the cost function is:

C = \frac{1}{N} \sum_{x} \langle \phi^{\text{out}}_x | \rho^{\text{out}}_x | \phi^{\text{out}}_x \rangle

Training uses a gradient ascent algorithm on the fidelity-based cost, updating each perceptron (parametrized as U = \exp(iK)) by:

U \longleftarrow \exp(i\epsilon K) \cdot U

The gradient calculation exploits the layered structure; by propagating an adjoint state backward (analogous to classical backpropagation), gradients can be computed locally per layer using commutators between current layer states and adjoint states. For a first-layer perceptron, the change in cost under an infinitesimal parameter shift is:

\Delta C = \frac{i}{N} \sum_x \mathrm{tr}(M_1 K_1 + M_2 K_2)

The optimal update direction is computed under a norm constraint on K, leading to closed-form update rules (Beer et al., 2019). A crucial feature is that the derivative with respect to network parameters is computable within each layer rather than across the global state.
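The update rule can be mimicked classically. The sketch below (numpy only) is a simplified, single-unitary stand-in for the layer-local rule: the Hermitian direction K is estimated by central finite differences rather than by the closed-form expression of Beer et al., and the names are illustrative.

```python
import numpy as np

def expi_hermitian(K, eps):
    """Return exp(i * eps * K) for Hermitian K via eigendecomposition."""
    vals, vecs = np.linalg.eigh(K)
    return vecs @ np.diag(np.exp(1j * eps * vals)) @ vecs.conj().T

def average_fidelity(U, inputs, targets):
    """Cost C = (1/N) * sum_x |<phi_out_x| U |phi_in_x>|^2 for pure targets."""
    return np.mean([abs(np.vdot(t, U @ s)) ** 2 for s, t in zip(inputs, targets)])

def hermitian_basis(d):
    """A basis of Hermitian d x d matrices (diagonal plus symmetric/antisymmetric pairs)."""
    basis = []
    for a in range(d):
        E = np.zeros((d, d), dtype=complex); E[a, a] = 1.0
        basis.append(E)
        for b in range(a + 1, d):
            S = np.zeros((d, d), dtype=complex); S[a, b] = S[b, a] = 1.0
            A = np.zeros((d, d), dtype=complex); A[a, b] = -1j; A[b, a] = 1j
            basis.extend([S, A])
    return basis

def train_step(U, inputs, targets, eps=0.05, delta=1e-4):
    """One gradient-ascent step U <- exp(i*eps*K) U, with the Hermitian
    direction K estimated by finite differences (the paper instead derives
    K in closed form, layer by layer)."""
    K = np.zeros(U.shape, dtype=complex)
    for B in hermitian_basis(U.shape[0]):
        c_plus = average_fidelity(expi_hermitian(B, delta) @ U, inputs, targets)
        c_minus = average_fidelity(expi_hermitian(B, -delta) @ U, inputs, targets)
        K += ((c_plus - c_minus) / (2 * delta)) * B
    return expi_hermitian(K, eps) @ U
```

In the paper itself, K is obtained analytically per layer from commutators of propagated and back-propagated states; the finite-difference estimate here only serves to make the update rule concrete.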

3. Quantum and Classical Implementations

The framework is designed for both simulation on classical computers and direct implementation on quantum hardware:

  • Classical Simulation: For networks of a few qubits, standard numerical tools (MATLAB, Mathematica) can simulate the system, leveraging the local CP-map structure and per-layer gradient update (Beer et al., 2019).
  • Quantum Implementation: Quantum subroutines use (a) the “swap trick” for fidelity estimation: prepare the network output alongside the pure target state, apply a controlled-SWAP, and measure an ancilla to extract the fidelity (a numerical sketch appears at the end of this section); and (b) the physical realization of layer-wise CP maps: initialize ancillas, apply the layer unitaries, and trace out prior layers.

The required hardware capabilities include state initialization in |0\rangle, universal quantum gates (e.g., CNOT, T, H), computational-basis measurement, and classical post-processing for partial trace operations.

Notably, only the “width” (number of qubits per layer) determines resource scaling; the procedure does not require keeping the full global state in memory, allowing deep, scalable QNNs with memory cost independent of total network depth.
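The swap trick mentioned above can be checked numerically. The following density-matrix simulation (plain numpy; function names are illustrative) runs the ancilla-Hadamard-controlled-SWAP-Hadamard circuit and compares 2p_0 - 1 with the directly computed fidelity \langle\phi|\rho|\phi\rangle:

```python
import numpy as np

def swap_operator(d):
    """SWAP on two d-dimensional registers: SWAP |i>|j> = |j>|i>."""
    S = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            S[j * d + i, i * d + j] = 1.0
    return S

def swap_test_p0(phi, rho):
    """Density-matrix simulation of the swap trick: ancilla in |0>, Hadamard,
    controlled-SWAP between the two registers, Hadamard, read the ancilla.
    Returns p0, the probability of measuring 0."""
    d = len(phi)
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    anc0 = np.array([[1, 0], [0, 0]], dtype=complex)        # |0><0| ancilla
    state = np.kron(anc0, np.kron(np.outer(phi, phi.conj()), rho))
    cswap = np.kron(np.diag([1.0, 0.0]), np.eye(d * d)) \
          + np.kron(np.diag([0.0, 1.0]), swap_operator(d))
    circ = np.kron(H, np.eye(d * d)) @ cswap @ np.kron(H, np.eye(d * d))
    state = circ @ state @ circ.conj().T
    return np.real(np.trace(np.kron(anc0, np.eye(d * d)) @ state))

# Check F = 2*p0 - 1 against the direct fidelity <phi|rho|phi> for one qubit.
phi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)     # a mixed state
print(2 * swap_test_p0(phi, rho) - 1, np.real(np.vdot(phi, rho @ phi)))
```

On real hardware, p_0 would be estimated statistically from repeated runs rather than read off a density matrix.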

4. Scalability, Efficiency, and Resource Overhead

Key aspects of the approach for enabling deep QNNs include:

  • Scalability: The resource overhead is determined by the largest layer width, not total qubit count. At each optimization step, only the present and adjacent layers need to be co-simulated or co-implemented.
  • Efficiency on NISQ Devices: The approach is suited for noisy intermediate-scale quantum (NISQ) devices, as circuits are shallow for a fixed width, and repeated execution (for statistical fidelity estimation) is natural on such platforms.
  • Absence of Barren Plateaus: Empirical and theoretical evidence indicates the absence of exponentially vanishing gradients (“barren plateaus”) in this QNN class—an important distinction from random-circuit variational QNNs (2011.06258).
  • Robustness: Benchmarking shows remarkable robustness to noise. When a fraction of training data is replaced with random data, performance and generalization remain strong.

5. Benchmarking, Generalization, and Robustness

Performance was evaluated through unitary learning tasks, where the QNN is trained to approximate an unknown unitary V. After training on n of N Haar-random pairs, the average fidelity on the whole set is predicted theoretically as:

C \approx \frac{n}{N} + \frac{N-n}{N D (D+1)} \left[ D + \min\{ n^2+1, D^2 \} \right]

where D is the Hilbert space dimension. The authors confirm via simulation that the QNN generalizes from a number of training pairs smaller than D, and that the observed performance closely matches the theoretical optimum.
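For concreteness, the prediction above can be evaluated directly; a short sketch (the function name is illustrative):

```python
def predicted_cost(n, N, D):
    """Theoretical estimate of the average fidelity C after training on n of N
    Haar-random pairs, with Hilbert space dimension D (formula quoted above)."""
    return n / N + (N - n) / (N * D * (D + 1)) * (D + min(n ** 2 + 1, D ** 2))

# Example: a 2-qubit target unitary (D = 4), N = 10 Haar-random pairs in total.
for n in range(0, 11):
    print(n, round(predicted_cost(n, 10, 4), 3))
```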

Additionally, noise robustness is demonstrated by evaluating the network on data sets where a portion of training pairs are replaced by random states. Crucially, the network remained stable and performant, and no evidence of training pathologies such as barren plateaus was found (Beer et al., 2019).

6. Mathematical Formalism and Quantum Subroutines

The QNN formalism is anchored in explicit mathematical constructs:

  • Network Output:

\rho_{\text{out}} = \operatorname{tr}_{\text{in, hidden}} \left[ U_{\text{out}}\, U^L \cdots U^1 \left(\rho_{\text{in}} \otimes |0\dots 0\rangle\langle 0\dots 0|_{\text{hidden, out}} \right) (U^1)^\dagger \cdots (U^L)^\dagger\, (U_{\text{out}})^\dagger \right]

  • Cost Function:

C = \frac{1}{N} \sum_x \langle \phi_x^{\text{out}} | \rho_x^{\text{out}} | \phi_x^{\text{out}} \rangle

  • Gradient Update:

U_j^l(s+\epsilon) = \exp\left(i\epsilon K_j^l(s)\right) U_j^l(s)

with K_j^l determined via the local gradient ascent calculation.

  • Layer CP-map:

\mathcal{E}^l(X^{(l-1)}) = \operatorname{tr}_{l-1} \left[ U^l \left( X^{(l-1)} \otimes |0\dots 0\rangle\langle 0\dots 0| \right) (U^l)^\dagger \right]

  • Swap Trick for Fidelity:

F(|\phi\rangle, \rho) = 2p_0 - 1, \quad p_0 = \frac{1}{2} + \frac{1}{2} \operatorname{tr}\left(\operatorname{SWAP} \cdot |\phi\rangle\langle\phi| \otimes \rho\right)

These equations ground both the training and the quantum implementation protocol and allow for precise analytic and computational assessment.
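As a consistency check, the global output formula and the layer-wise CP-map definition can be compared numerically on a small example. The sketch below (plain numpy; a 1-1-1 qubit network with random two-qubit layer unitaries, all names illustrative) verifies that composing the layer maps reproduces the globally defined output state:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(d):
    """Haar-style random unitary via QR decomposition of a complex Gaussian."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, R = np.linalg.qr(M)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

def ptrace_first(rho, da, db):
    """Trace out the first factor of a state on C^da (x) C^db."""
    return np.einsum('ijik->jk', rho.reshape(da, db, da, db))

def layer_map(rho_prev, U):
    """E^l: attach |0><0| on the new qubit, apply U, trace out the old qubit."""
    zero = np.zeros((2, 2), dtype=complex); zero[0, 0] = 1.0
    joint = U @ np.kron(rho_prev, zero) @ U.conj().T
    return ptrace_first(joint, 2, 2)

# One input, one hidden, one output qubit; each layer unitary acts on 2 qubits.
U1, Uout = random_unitary(4), random_unitary(4)
rho_in = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0| input

# (a) layer-by-layer composition of the CP maps
rho_out_layers = layer_map(layer_map(rho_in, U1), Uout)

# (b) global formula: act on the full 3-qubit register, trace out in + hidden
zero2 = np.zeros((4, 4), dtype=complex); zero2[0, 0] = 1.0
full = np.kron(rho_in, zero2)                        # rho_in (x) |00><00|
U1_full = np.kron(U1, np.eye(2))                     # U^1 on (in, hidden)
Uout_full = np.kron(np.eye(2), Uout)                 # U_out on (hidden, out)
full = Uout_full @ U1_full @ full @ U1_full.conj().T @ Uout_full.conj().T
rho_out_global = ptrace_first(full, 4, 2)            # trace over in (x) hidden

print(np.allclose(rho_out_layers, rho_out_global))   # expected: True
```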

7. Practical Implications and Future Extensions

The QNN framework combining quantum perceptrons, layered compositional channels, gradient-based optimization, and efficient measurement routines enables:

  • Scalable quantum learning applicable to both classical and fully quantum tasks.
  • Implementation and training on near-term quantum processors, with efficient resource scaling.
  • General applicability to unknown quantum channel learning, quantum process tomography, and quantum data compression.
  • Extension to broader QNN variants, including those with explicit nonlinearity, dissipative or recurrent architectures, and hybrid quantum-classical processing.
  • Absence of observed barren plateaus or exponentially vanishing gradients, providing a significant practical advantage over random-circuit variational QNNs.

This approach is representative of a new class of quantum machine learning models that exploit native quantum mechanical operations for learning, optimization, and generalization in high-dimensional quantum spaces.
