
Hybrid Quantum-Classical Convolutional Neural Networks

Updated 4 December 2025
  • Hybrid QCNNs are architectures that integrate classical convolution with parameterized quantum circuits for nonlinear feature extraction in high-dimensional data.
  • They employ advanced quantum encoding, pooling strategies, and joint gradient descent to optimize feature maps while mitigating noise effects.
  • Empirical studies demonstrate improved accuracy, parameter efficiency, and scalability across applications such as image classification, medical imaging, and signal processing.

A hybrid quantum-classical convolutional neural network (QCNN) is an architecture that integrates classical convolutional neural network operations with parameterized quantum circuits (PQCs) in a tightly-coupled, layered end-to-end framework. This hybridization seeks to exploit the high expressibility of quantum circuits for feature extraction and non-classical data transformation, with classical processing handling scalability, non-quantum nonlinearity, and final classification. Hybrid QCNNs have been demonstrated across a variety of domains including image classification, medical imaging, and signal analysis, offering compelling empirical and theoretical evidence for improved discriminative capability, parameter efficiency, and noise resilience under constrained quantum hardware budgets (Dou et al., 2022, Wang et al., 14 Oct 2025, Shen et al., 2021, Li et al., 7 Jan 2025).

1. Fundamental Hybrid QCNN Architecture

The prototypical hybrid QCNN replaces classical convolutional layers with quantum feature extraction (QFE) layers, each composed of three stages: (i) classical-to-quantum encoding unitaries, (ii) a trainable PQC acting as a nonlinear filter, and (iii) quantum measurement, often followed by a classical bias and nonlinearity (Dou et al., 2022). Given a classical input tensor, small spatial patches are locally encoded to n-qubit quantum states by data-dependent rotations:

$$|x_{i,j}\rangle = U_{\phi}(x_{i,j})\,|0\rangle^{\otimes n},\qquad U_{\phi}(x)=\bigotimes_{i=1}^n R_{y}\bigl(\phi(x_i)\bigr).$$

These states are processed patchwise by sliding the PQC filter over the input, analogous to classical weight sharing, with the quantum circuit parameters $\theta$ replacing conventional filter weights. Measurement yields local feature maps, and multiple such quantum filters can be stacked and composed with classical pooling or fully-connected layers. The complete layer is highly expressive, implementing a map from a classical patch to a nonlinear function in a $2^n$-dimensional Hilbert space (Dou et al., 2022, Tomal et al., 21 Oct 2024, Liu et al., 2019).
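The three stages of a QFE layer (encode, trainable filter, measure) can be sketched as a small statevector simulation in plain NumPy. This is a minimal sketch, not the ansatz of any cited paper: the two-pixel patch size, the $\pi$-scaled angle map, and the single CNOT entangler are illustrative assumptions.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def kron_all(mats):
    """Kronecker product of a list of matrices (qubit 0 first)."""
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

# CNOT with qubit 0 as control, qubit 1 as target
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def quantum_filter(patch, theta):
    """One QFE pass on a 2-pixel patch: (i) angle-encode the pixels,
    (ii) apply a trainable RY layer plus a CNOT entangler,
    (iii) measure per-qubit <Z> expectations as the feature map."""
    # (i) Encoding: |psi> = RY(pi*x0) (x) RY(pi*x1) |00>
    state = kron_all([ry(np.pi * x) for x in patch]) @ np.array([1.0, 0, 0, 0])
    # (ii) Trainable filter layer with parameters theta
    state = CNOT @ kron_all([ry(t) for t in theta]) @ state
    # (iii) Measurement: <Z_i> for each qubit
    Z, I = np.diag([1.0, -1.0]), np.eye(2)
    return [state.conj() @ kron_all([Z, I]) @ state,
            state.conj() @ kron_all([I, Z]) @ state]
```

Sliding `quantum_filter` over every 2-pixel patch of an input, with a single shared `theta`, reproduces the classical weight-sharing convolution pattern described above.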

Modern hybrids further integrate advanced classical feature extractors—e.g., Vision Transformers for dimension reduction and robust embedding—upstream of the quantum block, or distribute QCNN computation across sliced subcircuits to match hardware constraints (Wang et al., 14 Oct 2025, Li et al., 7 Jan 2025). QCNN-based networks generalize naturally to higher-dimensional data (e.g. 3D CT volumes), time-series signals, and multi-spectral inputs.

2. Quantum Convolutional Filter Design and Data Encoding

The central PQC filter is typically a sequence of alternating layers comprising parameterized single-qubit rotations and entangling gates, such as CNOT ladders, ring CZs, or more complex designs derived from ansatz search (Dou et al., 2022, Yurtseven, 29 Nov 2025, Wang et al., 14 Oct 2025). An example QAOA-like ansatz for an n-qubit filter is

$$U(\theta)=\prod_{\ell=1}^{L}\left[\bigotimes_{i=1}^n R_{y,i}\bigl(\theta_{i}^{(\ell)}\bigr)\cdot\prod_{i=1}^{n-1} \mathrm{CNOT}_{i,i+1}\right].$$

The encoding strategy is crucial for expressiveness and trainability. Common approaches include:

  • Angle encoding: $R_y(\phi(x_i))$ or richer RX- and RZ-based schemes, mapping real-valued data into qubit rotations (Tomal et al., 21 Oct 2024, Yurtseven, 29 Nov 2025).
  • Amplitude encoding: an N-dimensional classical vector is encoded into the superposition $|\psi(x)\rangle = \frac{1}{\|x\|}\sum_{i=0}^{N-1} x_i\,|i\rangle$ on $\log_2 N$ qubits (Wang et al., 14 Oct 2025, Shen et al., 2021).
  • Higher-order entangled encoding: involves pre-entanglement (CNOT, CZ) among qubits before parameterized layers to embed input correlations (Matic et al., 2022).
  • Patch-based encoding: sliding local window-to-qubit mapping for spatial data or direct compressed representation for 1D signals (Torabi et al., 4 Nov 2025).
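The first two encoding schemes above can be sketched directly as amplitude-vector constructions. This is a minimal NumPy sketch; the specific angle map $\phi(x_i) = \pi x_i$ is an illustrative assumption.

```python
import numpy as np

def angle_encode(x):
    """Angle encoding: one qubit per feature, RY(pi*x_i)|0> per qubit.
    Returns the n-qubit product state as a length-2^n amplitude vector."""
    state = np.array([1.0])
    for xi in x:
        qubit = np.array([np.cos(np.pi * xi / 2), np.sin(np.pi * xi / 2)])
        state = np.kron(state, qubit)
    return state

def amplitude_encode(x):
    """Amplitude encoding: an N-dimensional vector stored in the
    amplitudes of log2(N) qubits, |psi(x)> = sum_i x_i |i> / ||x||."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)
```

The trade-off is visible in the shapes: angle encoding spends one qubit per feature, while amplitude encoding packs $2^n$ features into $n$ qubits at the cost of a nontrivial state-preparation circuit.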

Feature extraction is finalized by measuring expectation values, often of Pauli-$Z$ (or $X$) operators, yielding scalars or vectors suitable for downstream classical processing.

3. Hybrid Quantum-Classical Computation and Training

Hybrid QCNNs are optimized using joint quantum-classical gradient descent. The full (supervised) loss, typically cross-entropy over softmax logits, admits gradients with respect to both the classical network weights and the PQC parameters. Gradients of quantum circuit outputs with respect to $\theta$ are computed via the parameter-shift rule:

$$\frac{\partial}{\partial \theta_k}\langle M \rangle = \frac{1}{2}\left[ \langle M \rangle_{\theta_k+\pi/2} - \langle M \rangle_{\theta_k-\pi/2} \right].$$

Backpropagation treats PQC observables as "macro-nodes," with the optimizer (e.g., Adam) updating all variables in tandem (Dou et al., 2022, Li et al., 7 Jan 2025). Efficient pipeline execution often leverages batched parameter-shift evaluations, while on NISQ hardware, measurement shot noise and decoherence are mitigated by shallow circuits, parameter sharing, pooling by partial trace, and distributed circuit splitting (e.g., Pauli-basis cuts for resource reduction) (Li et al., 7 Jan 2025).
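For a single RY rotation the parameter-shift rule can be checked end to end: with the one-qubit circuit $R_y(\theta)|0\rangle$, the expectation $\langle Z\rangle = \cos\theta$ has analytic derivative $-\sin\theta$, which the two shifted evaluations recover exactly.

```python
import numpy as np

def expval_z(theta):
    """<Z> for the one-qubit circuit RY(theta)|0>; analytically cos(theta)."""
    state = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return state @ np.diag([1.0, -1.0]) @ state

def parameter_shift_grad(f, theta):
    """Parameter-shift rule: exact (not finite-difference) gradient of a
    PQC expectation with respect to a rotation angle, from two circuit
    evaluations at theta +/- pi/2."""
    return 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))
```

Because the shift is macroscopic ($\pi/2$), the two-evaluation recipe stays well conditioned under measurement shot noise, unlike a small-step finite difference.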

Pooling is typically adapted from the classical paradigm, either as partial trace (discarding qubits post-convolution), explicit learnable two-qubit "pool" unitaries (e.g., CRZ/CRX), or classical pooling (e.g., max, avg) after measurement (Wang et al., 14 Oct 2025, Dou et al., 2022). Careful measurement and classical post-processing integration are critical—recent work suggests recycling or reusing measured/discarded qubits can significantly improve information utilization and accuracy (Anwar et al., 25 Aug 2025).
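Pooling by partial trace can be sketched as a density-matrix operation. This minimal NumPy version traces out the last (discarded) qubit; it is an illustrative sketch, not any cited paper's exact pooling layer.

```python
import numpy as np

def partial_trace_last(rho, n_qubits):
    """Trace out the last qubit of an n-qubit density matrix rho
    (shape 2^n x 2^n), modeling 'pooling by partial trace': the
    discarded qubit's degrees of freedom are summed out."""
    d = 2 ** (n_qubits - 1)
    rho = rho.reshape(d, 2, d, 2)
    # Sum over the matched row/column indices of the last qubit
    return np.einsum('iaja->ij', rho)
```

For a maximally entangled input (e.g., a Bell pair), the surviving qubit is left maximally mixed, illustrating how this pooling deliberately coarse-grains local detail.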

4. Expressibility, Complexity, and Hardware Integration

The expressibility of a PQC-based quantum convolution filter is measured by its capacity to approximate Haar-random unitaries—quantified via frame potentials or t-design closeness—as well as average entanglement entropy (Wang et al., 14 Oct 2025, Dou et al., 2022). Highly expressive ansatzes exhibit superior discriminative power and lower training cost, but are prone to barren plateaus in deep circuits or high qubit count regimes. Empirical studies indicate best performance from ansatzes with uniform, midrange Schmidt–von Neumann entropy across two-qubit gates, and modest layer depths (e.g., $L\leq 3$) (Wang et al., 14 Oct 2025).

Complexity scales favorably compared to classical filters: a classical convolution filter of size $f\times f \times C_{\mathrm{in}}$ has $f^2 C_{\mathrm{in}}$ weights, whereas an n-qubit quantum filter with $L$ layers has only $nL$ parameters, yet operates over a $2^n$-dimensional space, affording large non-local, nonlinear expressivity per parameter (Dou et al., 2022).
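The comparison is simple arithmetic; the concrete numbers below (a 3×3 filter over 3 input channels versus a 4-qubit, 2-layer quantum filter) are illustrative choices, not figures from the cited work.

```python
def filter_params(f, c_in, n_qubits, n_layers):
    """Classical f x f x C_in filter weight count vs. quantum n*L
    parameter count, plus the 2^n Hilbert-space dimension the
    quantum filter acts on."""
    classical = f * f * c_in
    quantum = n_qubits * n_layers
    hilbert_dim = 2 ** n_qubits
    return classical, quantum, hilbert_dim
```

Here `filter_params(3, 3, 4, 2)` gives `(27, 8, 16)`: the quantum filter uses fewer than a third of the weights while acting in a 16-dimensional state space.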

Hardware tractability on NISQ devices is emphasized by:

  • Patch/local encoding (reducing the required active qubits at any time).
  • Circuit splitting (Pauli cut, subcircuit recombination) to operate large logical QCNNs over minimal hardware resources (Li et al., 7 Jan 2025).
  • Gate-efficient ansatzes using crosstalk-minimizing layouts (e.g., ring or ladder CNOTs, controlled-rotation pooling).
  • Parameter sharing and pooling to alleviate the quantum resource bottleneck.

QCNN circuit depth, gate count, error per layer, and measurement shot count are key metrics in real-device deployments (Röseler et al., 9 May 2025, Yurtseven, 29 Nov 2025).

5. Empirical Performance and Application Domains

Hybrid QCNNs have demonstrated competitive or superior accuracy to parameter-matched classical CNNs across a spectrum of data modalities:

  • On MNIST, a two-block hybrid QCNN achieved 3.1% test-set error, matching small classical CNNs with minimal ($n\leq 3$) qubits per filter; increasing PQC expressibility systematically improves performance (Dou et al., 2022).
  • In high-dimensional color image classification (CIFAR-10), integrating a ViT feature compressor with a QCNN layer resulted in 99.77% accuracy, while the accuracy of classical models with the same parameter count dropped by 29.36% (Wang et al., 14 Oct 2025).
  • Medical imaging tasks (multi-class skin lesion, chest x-ray, breast tumor) demonstrated improved accuracy and 10× parameter reduction versus deep classical networks by integrating distributed QCNN modules—circuit splitting allowed 8-qubit logical models to run on 5-qubit hardware (Li et al., 7 Jan 2025, Yurtseven, 29 Nov 2025).
  • Signal processing applications (PCG classification) using three-layer QCNNs compress full-scale time-frequency data into an 8-qubit representation, achieving >93% accuracy (Torabi et al., 4 Nov 2025).

Noise resilience is evident, for example, under amplitude damping channels, where certain QCNN ansatzes exhibited +2.71% accuracy gains due to implicit regularization (Wang et al., 14 Oct 2025). QCNNs have also been successfully co-trained with classical or quantum feature fusion heads, delivering statistically significant improvements validated by non-parametric hypothesis testing (Wilcoxon signed-rank, Cohen's d >2) (Yurtseven, 29 Nov 2025), and have been deployed on large (49-qubit) real quantum hardware surpassing classical accuracy under identical learning conditions (Röseler et al., 9 May 2025).

The methodology generalizes to many data types: from 2D and 3D volumetric medical images (Matic et al., 2022, Li et al., 7 Jan 2025), to biological signals (Torabi et al., 4 Nov 2025), and multiclass natural images with hybrid transfer learning (Shi et al., 2023). Key implementation recommendations include optimal encoding (amplitude or midrange angle), regularization by pooling/trace, classical–quantum parameter co-design, and shallow-depth, expressive ansatzes.

6. Research Directions and Open Challenges

Hybrid QCNNs introduce a series of open research directions:

  • Scaling QCNNs to deeper, wider (more qubits) architectures without loss of trainability, considering the barren plateau effect (Shi et al., 2023).
  • Exploring circuit composition and feature fusion, particularly the fusion of multiple quantum embeddings or reusing measured/discarded qubit streams for enhanced gradient signals (Anwar et al., 25 Aug 2025).
  • Extending hybrid quantum-classical methods beyond CNN topologies, e.g., integrating quantum residual blocks, quantum Fourier/other spectral convolutional layers, or quantum pooling strategies tailored to data-rich applications (Shen et al., 2021, Shi et al., 2023, Ng et al., 4 Jun 2024).
  • Benchmarking under hardware noise through error-mitigated training, large-scale parameter sweeps for ansatz/encoding optimizations, and standardizing empirical evaluation (accuracy, convergence, noise robustness, generalization gap).
  • Investigating quantum–classical co-design heuristics—e.g., synergy between classical compression (ViT, ResNet) and quantum non-local fusion, entanglement-aware PQC engineering, or circuit splitting/cutting for scalable deployment on limited-qubit devices.

Continued progress in QCNN research will depend on both improved quantum hardware and deeper algorithmic understanding of quantum feature maps, expressible circuit classes, and optimal division of labor between quantum and classical layers.
