Noisy Intermediate-Scale Quantum Devices

Updated 8 May 2026

NISQ devices are quantum processors with limited qubit counts and inherent noise, operating without full error correction.
They power advanced methods like variational quantum classifiers that embed classical data and optimize parameterized circuits under noise constraints.
Research focuses on optimizing circuit depth, encoding strategies, and noise resilience to harness practical quantum advantages on current hardware.

A variational quantum classifier (VQC) is a hybrid quantum–classical supervised learning architecture that embeds classical data into quantum states via a feature map, applies a trainable parameterized quantum circuit (the ansatz), and post-processes quantum measurement outcomes classically to yield predictions. VQCs exploit high-dimensional representations and quantum entanglement, leveraging optimization of circuit parameters to minimize a classical loss function on training data. Their structure and components are deeply influenced by noise and depth constraints of current noisy intermediate-scale quantum (NISQ) hardware, as well as by the classical–quantum interface for optimization, encoding, and resource control.

1. Formal Structure and Theoretical Principles

A VQC consists of three main stages:

Quantum Feature Map (State Preparation):
- A classical input vector $x\in\mathbb{R}^d$ is embedded into a quantum state $|\phi(x)\rangle$ on $n$ qubits via a parameterized quantum circuit.
- Common feature maps include amplitude encoding (RawFeatureVector), angle-based encoding with single-qubit rotations, or more elaborate nonlinear maps such as those implemented by Qiskit's ZZFeatureMap or custom diagonal unitary circuits (Shahriyar et al., 3 Mar 2025, Yin et al., 7 Jun 2025, Souza et al., 21 May 2025, Sen et al., 2021).
- Amplitude encoding prepares $|\phi(x)\rangle = \sum_{i=0}^{2^n-1} x_i'|i\rangle$ with $\|x'\|_2=1$, requiring $n=\log_2 d_{\mathrm{pad}}$ qubits after zero-padding $x$ to length $d_{\mathrm{pad}}$, and comprises a single, global initialization unitary (Shahriyar et al., 3 Mar 2025).
Variational Ansatz (Trainable Circuit Layers):
- The core of the VQC is a parameterized unitary $V(\theta)$ (the ansatz), classically optimized to minimize empirical loss.
- Standard ansatze include:
  - RealAmplitudes: $r$ repetitions of single-qubit $R_y$ rotations and full entangling layers with CNOTs.
  - EfficientSU2: $r$ repetitions of layers of universal single-qubit $R_y$ and $R_z$ rotations and full/bespoke entangling meshes (Shahriyar et al., 3 Mar 2025, Souza et al., 21 May 2025).
- Parameterization is linear in both qubit count and layer depth: for RealAmplitudes, $n(r+1)$ parameters; for EfficientSU2, $2n(r+1)$ (counts for Qiskit's default configuration, which appends a final rotation layer); with circuit depth and gate count scaling accordingly.
Measurement and Classical Prediction:
- Expectation values of single or multiple qubit observables (commonly Pauli-Z on output qubits) are collected:
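$\langle Z \rangle_{x,\theta} = \langle \phi(x) | V^\dagger(\theta)\, Z\, V(\theta) | \phi(x) \rangle$

The measurement outcomes are post-processed to yield class probabilities, e.g., $p_\theta(y=1 \mid x) = \tfrac{1}{2}\left(1 - \langle Z \rangle_{x,\theta}\right)$, the probability of reading $|1\rangle$ on the output qubit (the opposite sign convention is also in use).
Final prediction is usually thresholded at 0.5, i.e., $\hat{y} = \mathbb{1}\left[p_\theta(y=1 \mid x) \ge 0.5\right]$ for binary tasks (Shahriyar et al., 3 Mar 2025, Souza et al., 21 May 2025, Sen et al., 2021).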
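
The three stages above can be sketched end to end with a small NumPy statevector simulation. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: the feature map is plain $R_y$ angle encoding, the ansatz is a RealAmplitudes-style stack with all-to-all CNOTs, the readout is $\langle Z \rangle$ on qubit 0, and class 1 is assigned probability $(1 - \langle Z \rangle)/2$. All function names are hypothetical.

```python
import numpy as np

def ry(theta):
    """Single-qubit R_y rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q of an n-qubit statevector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [q]))  # gate output becomes axis 0
    return np.moveaxis(psi, 0, q).reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target qubit on the control = 1 subspace."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    # Integer indexing drops the control axis, shifting later axes down by one.
    t = target - 1 if target > control else target
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t).copy()
    return psi.reshape(-1)

def expect_z(state, q, n):
    """Expectation value of Pauli-Z on qubit q."""
    probs = np.abs(state.reshape([2] * n)) ** 2
    p = probs.sum(axis=tuple(a for a in range(n) if a != q))
    return float(p[0] - p[1])

def vqc_predict_proba(x, theta, reps):
    """Return the class-1 probability for one input vector x."""
    n = len(x)
    state = np.zeros(2 ** n)
    state[0] = 1.0
    for q in range(n):                       # stage 1: angle-encoding feature map
        state = apply_1q(state, ry(x[q]), q, n)
    theta = np.asarray(theta, dtype=float).reshape(reps + 1, n)
    for layer in range(reps):                # stage 2: rotations + full entanglement
        for q in range(n):
            state = apply_1q(state, ry(theta[layer, q]), q, n)
        for c in range(n):
            for t in range(c + 1, n):
                state = apply_cnot(state, c, t, n)
    for q in range(n):                       # final rotation layer
        state = apply_1q(state, ry(theta[reps, q]), q, n)
    z = expect_z(state, 0, n)                # stage 3: measure <Z> on qubit 0
    return 0.5 * (1.0 - z)                   # assumed convention: P(readout = 1)

def vqc_predict(x, theta, reps):
    """Threshold the class-1 probability at 0.5."""
    return int(vqc_predict_proba(x, theta, reps) >= 0.5)
```

With all trainable angles set to zero, the input `[0, 0]` stays in $|00\rangle$ and is assigned class 0, whereas `[np.pi, 0]` flips qubit 0 and is assigned class 1; training would adjust `theta` to move this boundary.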

2. Data Encoding, Feature Maps, and Resource Efficiency

Quantum feature maps directly influence both resource requirements and expressive power:

Amplitude Encoding: Maps $d$-dimensional feature vectors into $\lceil \log_2 d \rceil$ qubits, producing quantum states matching the data's amplitude structure. Efficient for high-dimensional data but challenging to implement precisely at hardware level (Shahriyar et al., 3 Mar 2025, Miyahara et al., 2021).
Angle Encoding: Each feature $x_i$ is mapped to the angle of a single-qubit rotation (e.g., $R_y(x_i)$) or composite rotations, yielding $n = d$ qubits (one per feature); this is simpler gatewise but less qubit-efficient (Hwang et al., 16 Apr 2025, Yin et al., 7 Jun 2025, Souza et al., 21 May 2025).
Nonlinear/Kernel Feature Maps: Complex maps may include products of features in phase parameters or entangling gates for higher-order interactions (e.g., in Qiskit's ZZFeatureMap, $\phi_{ij}(x) = (\pi - x_i)(\pi - x_j)$ for $i \neq j$) (Souza et al., 21 May 2025).
QRAC and Trainable Discrete Encodings: For discrete/classical features, quantum random access coding (QRAC) encodes several classical bits into fewer qubits (the standard $(3,1)$-QRAC, for instance, stores three bits in one qubit), reducing resource demands while maintaining separability. Trainable encodings further enhance expressivity for difficult Boolean functions and class imbalance (Thumwanit et al., 2021).
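
The qubit-count trade-off between encodings can be made concrete with a short sketch. It shows the classical preprocessing for amplitude encoding (zero-padding and L2 normalization) and the pairwise phase arguments of a ZZ-type map, assuming the $(\pi - x_i)(\pi - x_j)$ form that Qiskit's ZZFeatureMap uses by default. The helper names are hypothetical, and the sketch does not synthesize the state-preparation circuit itself.

```python
from itertools import combinations

import numpy as np

def amplitude_encode(x):
    """Zero-pad x to the next power of two and L2-normalize it.

    Returns the amplitude vector and the number of qubits it occupies.
    """
    x = np.asarray(x, dtype=float)
    n_qubits = max(1, (len(x) - 1).bit_length())  # ceil(log2(d)), at least 1
    padded = np.zeros(2 ** n_qubits)
    padded[: len(x)] = x
    norm = np.linalg.norm(padded)
    if norm == 0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return padded / norm, n_qubits

def angle_encoding_qubits(x):
    """Angle encoding uses one qubit per feature."""
    return len(x)

def zz_pair_phases(x):
    """Pairwise phase arguments (pi - x_i)(pi - x_j) for all i < j."""
    return {(i, j): (np.pi - x[i]) * (np.pi - x[j])
            for i, j in combinations(range(len(x)), 2)}

features = [3.0, 4.0, 0.0, 0.0, 0.0]          # d = 5
amplitudes, n_amp = amplitude_encode(features)
n_angle = angle_encoding_qubits(features)
```

For this five-feature input, amplitude encoding needs 3 qubits (padded length 8) while angle encoding needs 5; the gap widens logarithmically as $d$ grows.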

3. Variational Ansatz Architectures and Circuit Design

RealAmplitudes Ansatz (common in Qiskit): repeated layers of $R_y(\theta)$ on each qubit and full CNOT entanglement (Shahriyar et al., 3 Mar 2025, Souza et al., 21 May 2025).
EfficientSU2 Ansatz: alternates parameterized $R_y$ and $R_z$ rotations with full-graph CNOT entanglement. This is deeper and parameter-rich, generally outperforming shallow ansätze but incurring increased circuit depth, gate counts, and wall times.
Problem-Tailored Variants: Specific tasks (graph data, photonic platforms) may require custom encodings or Mach–Zehnder interferometer networks (An et al., 24 Jan 2025, Lin et al., 2024).
Practical Constraints: Deeper ansätze offer greater representational power but exacerbate noise sensitivity, barren plateaus, and NISQ limitations (Sen et al., 2021, Shahriyar et al., 3 Mar 2025, Yao et al., 2024).
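
To see how quickly these ansätze grow, the following sketch counts trainable parameters and entangling gates. It assumes the Qiskit library defaults (a final rotation layer after the last repetition, and $R_y$ plus $R_z$ per qubit for EfficientSU2) together with an all-to-all CNOT pattern; other entanglement settings give different gate counts. The function names are hypothetical.

```python
def real_amplitudes_params(n_qubits, reps):
    """One R_y angle per qubit in each of the reps + 1 rotation layers."""
    return n_qubits * (reps + 1)

def efficient_su2_params(n_qubits, reps):
    """R_y and R_z angles per qubit in each of the reps + 1 rotation layers."""
    return 2 * n_qubits * (reps + 1)

def full_entanglement_cnots(n_qubits, reps):
    """All-to-all CNOTs: n(n-1)/2 per repetition."""
    return reps * n_qubits * (n_qubits - 1) // 2

# Example: 4 qubits, 3 repetitions
ra = real_amplitudes_params(4, 3)        # 16
su2 = efficient_su2_params(4, 3)         # 32
cnots = full_entanglement_cnots(4, 3)    # 18
```

Parameters grow linearly in both qubits and repetitions, but the all-to-all entangler grows quadratically in qubits, which is one reason hardware-efficient (e.g., linear) entanglement is often preferred on noisy devices.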

4. Training Methodologies and Optimization

Loss Functions: Binary classification tasks typically use cross-entropy loss:

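$\mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log p_\theta(y=1 \mid x_i) + (1-y_i)\log\left(1 - p_\theta(y=1 \mid x_i)\right)\right]$

Here $N$ is the number of training samples, $y_i \in \{0,1\}$ is the label of sample $x_i$, and $p_\theta(y=1 \mid x_i)$ is the class-1 probability predicted by the circuit with parameters $\theta$.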

Some studies employ mean-squared error or hinge loss (Shahriyar et al., 3 Mar 2025, Sen et al., 2021, Sierra-Sosa et al., 2020).
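
As a minimal sketch, the binary cross-entropy can be evaluated classically from the class-1 probabilities returned by the circuit. The clipping constant `eps` is an assumption added here for numerical safety (shot-based estimates can be exactly 0 or 1); it is not taken from the cited studies.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between labels in {0, 1} and class-1 probabilities."""
    y = np.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs for hard 0/1 probability estimates.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])   # equals ln 2
confident = binary_cross_entropy([1, 0], [0.9, 0.1])    # smaller loss
```

An uninformative classifier that always outputs 0.5 scores $\ln 2 \approx 0.693$, a useful baseline when monitoring whether training is making progress.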

Parameter Optimization:
- Derivative-Free Methods: COBYLA, SPSA, and genetic algorithms are frequently used because measurement outcomes are stochastic and noisy, which makes gradient estimates expensive. They are often credited with robustness to barren plateaus, although flat loss landscapes also hamper gradient-free optimizers. COBYLA is prevalent in simulated studies (Shahriyar et al., 3 Mar 2025, Yin et al., 7 Jun 2025, arXiv:2504.10073, Lin et al., 2024).
- Gradient Estimation: When gradients are computed, the parameter-shift rule is the method of choice, as it is exact for gates with Pauli generators and does not rely on finite differences (Hwang et al., 16 Apr 2025, Souza et al., 21 May 2025, Sen et al., 2021). Mini-batches and stochastic or full-batch updates are used, depending on data size and hardware constraints.
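
The parameter-shift rule can be checked on a toy case. The sketch below assumes each parameter enters through a single $R_y$ gate, for which the gradient is $\tfrac{1}{2}\left[f(\theta + \tfrac{\pi}{2}) - f(\theta - \tfrac{\pi}{2})\right]$, and uses a two-qubit product state with observable $Z \otimes Z$ so the exact answer is known in closed form. The function names are hypothetical, and expectation values are computed exactly rather than from shots.

```python
import numpy as np

def expval_z_after_ry(theta):
    """<Z> for the state R_y(theta)|0>, which equals cos(theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return float(c ** 2 - s ** 2)

def product_zz(theta):
    """<Z (x) Z> for R_y(theta_0)|0> (x) R_y(theta_1)|0>."""
    return expval_z_after_ry(theta[0]) * expval_z_after_ry(theta[1])

def parameter_shift_grad(f, theta):
    """Gradient of f via +/- pi/2 shifts, one parameter at a time."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        shift = np.zeros_like(theta)
        shift[k] = np.pi / 2
        grad[k] = 0.5 * (f(theta + shift) - f(theta - shift))
    return grad

theta = np.array([0.3, 1.1])
grad = parameter_shift_grad(product_zz, theta)
exact = np.array([-np.sin(0.3) * np.cos(1.1), -np.cos(0.3) * np.sin(1.1)])
```

Unlike a finite-difference estimate, the shifts are large ($\pm\pi/2$), so the two evaluations differ by an amount that is easier to resolve under shot noise; the cost is two circuit evaluations per parameter.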
Resource Considerations: Simulation and training times scale superlinearly in both dataset size and circuit depth (number of repetitions), especially when amplitude encoding is used or circuit depth exceeds classical simulation capacity (Shahriyar et al., 3 Mar 2025, Yin et al., 7 Jun 2025). Experiments with large datasets (e.g., 960 vs. 640 training samples) observe tripling of wall time with $r$ increasing from 3 to 4 (Shahriyar et al., 3 Mar 2025).
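
Among the derivative-free optimizers mentioned above, SPSA is attractive under these resource constraints because each step needs only two loss evaluations regardless of the number of parameters. The following is a minimal sketch on a classical quadratic stand-in for the VQC loss; the constant gains `a` and `c` are simplifying assumptions (practical SPSA uses decaying gain sequences), and `spsa_minimize` is a hypothetical helper rather than a library routine.

```python
import numpy as np

def spsa_minimize(loss, theta0, n_iter=200, a=0.1, c=0.1, seed=0):
    """Simultaneous-perturbation stochastic approximation with constant gains."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        # Random +/-1 perturbation applied to all parameters at once.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        diff = loss(theta + c * delta) - loss(theta - c * delta)
        # For +/-1 entries, dividing by delta_k equals multiplying by delta_k.
        g_hat = diff / (2.0 * c) * delta
        theta -= a * g_hat
    return theta

target = np.array([1.0, -2.0])

def toy_loss(theta):
    """Quadratic bowl standing in for an expensive circuit-based loss."""
    return float(np.sum((theta - target) ** 2))

theta_opt = spsa_minimize(toy_loss, [0.0, 0.0])
```

The two-evaluation cost per step contrasts with the parameter-shift rule, which needs two evaluations per parameter; the price is a noisier gradient estimate.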

5. Empirical Performance, Limitations, and Scaling Behavior

Metrics: Performance is typically assessed via accuracy, precision, recall, F1 scores, macro-averaged F1, AUC, and MCC (Matthews Correlation Coefficient) for class imbalance. Example: For phishing detection, PhishVQC with RealAmplitudes/EfficientSU2 achieves macro F1 up to 0.89, a 22% improvement over previous VQC studies in the same domain (Shahriyar et al., 3 Mar 2025).
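
The imbalance-aware metrics named above can be computed directly from the binary confusion matrix. This is a small NumPy sketch with hypothetical helper names and made-up labels, included only to show why MCC and macro F1 are reported alongside accuracy.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """F1 = 2 tp / (2 tp + fp + fn), defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    # For class 0, the roles of tp/tn and fp/fn are swapped.
    return 0.5 * (f1_score(tp, fp, fn) + f1_score(tn, fn, fp))

def mcc(y_true, y_pred):
    """Matthews correlation coefficient, defined as 0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_model = [1, 1, 0, 0, 0, 0, 0, 1]
y_majority = [0] * 8   # always predicts the majority class
```

On these toy labels the majority-class predictor still reaches 62.5% accuracy, yet its MCC is 0, while the model that actually separates the classes scores an MCC of about 0.47.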
Dataset and Circuit Impact: Increasing the number of circuit repetitions and dataset size augments accuracy but leads to superlinear increases in wall time. Different ansätze (EfficientSU2 vs. RealAmplitudes) display a 2× difference in execution cost for similar accuracy (Shahriyar et al., 3 Mar 2025).
Generalization and Noise Robustness: VQC models exhibit strong generalization for small/moderate data sizes (converging within 300 samples on certain physics datasets), and in simulation, their decision boundaries are robust to moderate NISQ-style noise (accuracy drop ≤ 3%) (Yin et al., 7 Jun 2025).
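
One simple way to see why thresholded predictions can tolerate moderate noise is a global depolarizing channel, under which the expectation of a traceless observable such as Pauli-Z is rescaled by $(1 - p)$ without changing sign. The sketch below uses this toy model only; it is an assumption for illustration, not the noise model of the cited study, and it assumes class 1 is predicted when $\langle Z \rangle \le 0$. The function names and numeric values are hypothetical.

```python
import numpy as np

def depolarized_z(z_ideal, p):
    """<Z> after global depolarizing noise of strength p: (1 - p) * <Z>."""
    return (1.0 - p) * np.asarray(z_ideal, dtype=float)

def predict_from_z(z):
    """Class 1 when P(readout = 1) = (1 - <Z>) / 2 >= 0.5, i.e. <Z> <= 0."""
    return (np.asarray(z) <= 0).astype(int)

def sample_z(z, shots, rng):
    """Finite-shot estimate of <Z> from binomial readout counts."""
    p_one = (1.0 - np.asarray(z, dtype=float)) / 2.0
    ones = rng.binomial(shots, p_one)
    return 1.0 - 2.0 * ones / shots

z_ideal = np.array([0.8, 0.3, -0.05, -0.6])
z_noisy = depolarized_z(z_ideal, p=0.2)
same_labels = np.array_equal(predict_from_z(z_ideal), predict_from_z(z_noisy))
z_estimated = sample_z(z_noisy, shots=1000, rng=np.random.default_rng(0))
```

With exact expectation values the labels are unchanged, but the margin $|\langle Z \rangle|$ shrinks, so inputs close to the decision boundary (such as the third entry here) become more likely to flip once finite-shot sampling is taken into account.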
Limitations: Barren plateaus, class imbalance (especially low MCC), and simulation/hardware bottlenecks are pervasive challenges. Expressivity may be insufficient for under-parameterized circuits; deeper circuits tend to suffer from vanishing gradients (Hwang et al., 16 Apr 2025, Sen et al., 2021).
Comparison to Classical Methods: VQC often achieves accuracy comparable to or slightly higher than classical artificial neural networks and support vector machines, particularly as data complexity or size increases, but at the cost of substantially higher simulation/training time in noisy or high-parameter regimes (Yin et al., 7 Jun 2025, Shahriyar et al., 3 Mar 2025, Hwang et al., 16 Apr 2025, arXiv:2504.10073).

6. Extensions, Hybrid Approaches, and Practical Recommendations

Hybrid and Modular Architectures: VQCs have been effectively combined with classical feature extraction (e.g., tensor networks, hybrid autoencoders) and used as downstream nonlinear modules in quantum-inspired geometric and ensemble models (Chen et al., 2020, Maragkopoulos et al., 2024, Mohanty et al., 2 Apr 2026).
Hardware-Efficient and Resource-Aware Designs: Shallow ansatz depth (a small number of repetitions $r$), minimal qubit count (amplitude or QRAC encoding), hardware-efficient entanglement, and error- and tolerance-aware optimization are all recommended for practical NISQ deployment (Shahriyar et al., 3 Mar 2025, Yao et al., 2024, Ptáček et al., 12 Nov 2025).
Measurement Strategies and Resource Economy: Unambiguous quantum classifiers employ three-outcome POVMs with "I don't know" responses and repeat-until-accept schemes, trading slight decreases in accuracy for orders-of-magnitude reduction in quantum circuit executions (averaging only a small number of shots per input) (Ptáček et al., 12 Nov 2025).
Ensemble and Voting Methods: Plurality voting across multiple VQC models, potentially distributed across distinct hardware backends, significantly improves performance and noise resilience on real quantum hardware, surpassing both single-VQC and average-aggregation ensembles (Qin et al., 2022).
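
The difference between plurality voting and average aggregation can be illustrated with a few lines of NumPy. This is a sketch with hypothetical helper names and made-up probabilities, not the ensemble procedure of the cited work; ties in the vote are broken toward the lowest class index.

```python
import numpy as np

def plurality_vote(labels, n_classes):
    """labels: (n_models, n_samples) integer predictions; returns the most voted class."""
    labels = np.asarray(labels)
    n_samples = labels.shape[1]
    counts = np.zeros((n_classes, n_samples), dtype=int)
    for model_labels in labels:
        counts[model_labels, np.arange(n_samples)] += 1  # one vote per model
    return counts.argmax(axis=0)

def average_aggregate(probs):
    """probs: (n_models, n_samples, n_classes); returns argmax of the mean probabilities."""
    return np.asarray(probs, dtype=float).mean(axis=0).argmax(axis=1)

# Three models, one sample: two mildly prefer class 0, one strongly prefers class 1.
probs = np.array([
    [[0.55, 0.45]],
    [[0.55, 0.45]],
    [[0.01, 0.99]],
])
hard_labels = probs.argmax(axis=2)              # each model's own prediction
voted = plurality_vote(hard_labels, n_classes=2)
averaged = average_aggregate(probs)
```

Here the vote returns class 0 while averaging returns class 1, because a single overconfident model dominates the mean. When one backend is badly miscalibrated by noise, voting limits its influence to a single ballot.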

7. Practical Impact, Open Challenges, and Future Directions

VQCs demonstrate competitive or superior accuracy compared to classical models in domains where data is scarce, structure is nontrivial, or quantum-inspired feature space brings indirect benefits. However:

Achieving scalable, resource-efficient, and noise-resilient learning remains a principal challenge. Shallow circuits, optimized encoding, and NISQ-awareness are essential for near-term practical applications.
Performance continues to bottleneck in the face of noisy measurements, barren plateaus, and the growing computational cost of simulating or executing deep circuits.
Future avenues include integration with error-mitigated or error-corrected quantum hardware, implementation of more expressive feature maps or problem-inspired ansätze, development of quantum-aware optimizers, and deployment in hybrid classical–quantum pipelines for domain-specific learning tasks (Shahriyar et al., 3 Mar 2025, Yin et al., 7 Jun 2025, Maragkopoulos et al., 2024, Mohanty et al., 2 Apr 2026).
Realizing quantum advantage with VQCs, beyond classical kernel analogs, depends on the design of feature maps and variational layers that access classically hard-to-invert Hilbert space regions, while maintaining tractable optimization (Miyahara et al., 2021, Sen et al., 2021).

VQC technology thus remains a focus of QML research, providing a flexible and tunable class of quantum models at the intersection of variational optimization and quantum state discrimination (Shahriyar et al., 3 Mar 2025, Yin et al., 7 Jun 2025, Sen et al., 2021).