Noise-Induced Barren Plateaus in Quantum Circuits

Updated 3 July 2026

Noise-Induced Barren Plateaus (NIBP) are scenarios where noise accumulation causes an exponential decay in gradient variance, severely limiting the trainability of variational quantum algorithms.
The manifestation of NIBP depends on the interplay between noise type (unital vs non-unital), circuit connectivity, and cost function locality, with global observables exhibiting rapid gradient decay.
Mitigation strategies such as engineered dissipation, non-unitary ansätze, and local pre-training can preserve finite gradients in local cost functions, enabling scalable circuit design under non-unital noise.

Noise-Induced Barren Plateaus (NIBP) are a fundamental limitation to the scalability and trainability of variational quantum algorithms (VQAs) operating on noisy, pre-fault-tolerant hardware. NIBPs refer to the phenomenon where gradients of the cost function vanish exponentially due to noise accumulation, rendering classical parameter optimization intractable regardless of circuit initialization or ansatz structure. The interplay between noise model (especially unital vs non-unital channels), cost function locality, circuit connectivity, and algorithm design determines whether and when NIBPs manifest and whether they can be mitigated. Recent advances have revealed nuanced distinctions between noise types, effective circuit depth, and strategies for both diagnosing and overcoming NIBPs.

1. Theoretical Foundations: Definitions and Noise Models

Consider a parameterized quantum circuit with parameters $\theta=(\theta_1,\ldots,\theta_m)$ and cost function $C(\theta)=\operatorname{Tr}[H\,\Phi_\theta(\rho_0)]$, where $H$ is the measured observable, $\rho_0$ is the initial state, and $\Phi_\theta$ represents the full, possibly noisy evolution. A barren plateau occurs when the expected squared gradient norm over random parameter choices,

$\mathbb{E}_\theta[\|\nabla C\|_2^2] = O(e^{-\Omega(n)})$

is exponentially suppressed in the number of qubits $n$ (or, equivalently, $\mathrm{Var}_\theta[\partial_i C]=\mathbb{E}_\theta[(\partial_i C)^2]$ is exponentially small for each parameter $i$; the equality holds when $\mathbb{E}_\theta[\partial_i C]=0$).

Noise-Induced Barren Plateaus (NIBP) specifically arise when the decay is induced by noise channels rather than ansatz expressivity or cost function structure. For general noise, including local Pauli, depolarizing, or more general completely positive trace-preserving (CPTP) maps, the circuit is modeled as layers of local gates, each followed by a noise channel $N$. For a single-qubit state with Bloch vector $w$, the channel can be written in a "normal form" as

$N[(I + w\cdot\sigma)/2]= I/2 + \frac12 (t + D w)\cdot\sigma,$

with $t$ (affine shift) and $D$ (contraction) determining the unital ($t = 0$, e.g., depolarizing) and non-unital ($t \neq 0$, e.g., amplitude damping) regimes (Mele et al., 2024).

Unital noise channels (e.g., depolarizing, dephasing) preserve the maximally mixed state and exponentially drive all cost function gradients (and expectation values for traceless observables) toward zero as depth increases (Singkanipa et al., 2024, Wang et al., 2020, Larocca et al., 2024). Non-unital channels (e.g., amplitude damping, reset) have a nontrivial affine term ($t \neq 0$), which can preserve non-zero gradients in certain regimes.
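As a concrete check of the normal form, the following minimal sketch extracts $t$ and $D$ numerically from the Kraus operators of a single-qubit channel. The helper names (`apply_channel`, `normal_form`, `depolarizing_kraus`, `amplitude_damping_kraus`) and the noise strength 0.1 are illustrative choices, not taken from the cited works; the Kraus representations used are the standard textbook ones.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = (X, Y, Z)


def apply_channel(kraus, rho):
    """Apply a channel given by Kraus operators: rho -> sum_k K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)


def normal_form(kraus):
    """Return (t, D) with N[(I + w.sigma)/2] = I/2 + (t + D w).sigma / 2."""
    out_mixed = apply_channel(kraus, I2 / 2)
    # t_i = Tr[sigma_i N(I/2)]
    t = np.array([np.trace(P @ out_mixed).real for P in PAULIS])
    # D_ij = Tr[sigma_i N(sigma_j / 2)]
    D = np.array([[np.trace(Pi @ apply_channel(kraus, Pj / 2)).real
                   for Pj in PAULIS] for Pi in PAULIS])
    return t, D


def depolarizing_kraus(p):
    """Kraus operators for rho -> (1 - p) rho + p I/2."""
    return [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * P for P in PAULIS]


def amplitude_damping_kraus(gamma):
    """Kraus operators for decay from |1> to |0> with probability gamma."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return [K0, K1]


if __name__ == "__main__":
    for name, kraus in [("depolarizing", depolarizing_kraus(0.1)),
                        ("amplitude damping", amplitude_damping_kraus(0.1))]:
        t, D = normal_form(kraus)
        print(name, "t =", np.round(t, 4), "diag(D) =", np.round(np.diag(D), 4))
```

For the depolarizing channel the affine shift vanishes and $D$ is a uniform contraction, while amplitude damping returns a shift along the $z$ axis equal to the damping probability, which is exactly the unital versus non-unital distinction drawn above.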

2. NIBP Mechanisms and Gradient-Variance Suppression

Under realistic noise, the gradient variance associated with any trainable parameter $\theta_i$ decays as

$\mathrm{Var}_\theta[\partial_i C] \le O\!\left(c^{\,|P| + L - k}\right)$

for a Pauli observable $P$ of weight $|P|$ in an $L$-layer circuit, with $k$ the index of the layer containing $\theta_i$ and $c$ a contraction factor determined by the noise properties (Mele et al., 2024). When $N$ is unital, $c$ is strictly less than 1, and exponential suppression dominates for any global cost function and at all layers:

$\mathrm{Var}_\theta[\partial_i C] \le O\!\left(c^{L}\right)$

with $c < 1$ and $L$ large, implying an NIBP for any circuit whose depth grows with $n$ (Larocca et al., 2024, Wang et al., 2020, Schumann et al., 2023).

For local cost functions and non-unital noise, this gradient variance is exponentially suppressed except in the last $O(\log n)$ layers, where it can remain at least inverse-polynomially large:

$\mathrm{Var}_\theta[\partial_i C] \ge \Omega\!\left(1/\mathrm{poly}(n)\right)$

if $N$ is non-unital and the gate parameterized by $\theta_i$ acts within the light cone of $P$ (Mele et al., 2024). Thus, for local observables, only the last few layers remain trainable; the rest of the circuit is effectively "frozen" and does not affect the cost landscape ("effective shallowness").

For global cost observables (e.g., full state infidelity), all gradients vanish exponentially for both unital and non-unital noise, guaranteeing an NIBP irrespective of noise type or ansatz (Mele et al., 2024, Singkanipa et al., 2024).
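The layer dependence described in this section can be illustrated with a deliberately small toy model: a single qubit, layers of RY rotations each followed by a noise map in normal form, and the cost $\langle Z\rangle$. This is a sketch under stated assumptions, not the setting of the cited theorems; it shows decay with depth rather than with qubit number, and the function names and noise strengths (`p = gamma = 0.1`) are illustrative. Gradients are computed with the parameter-shift rule, which is exact here because the cost is a first-order trigonometric polynomial in each angle.

```python
import math
import random


def z_expectation(thetas, d_x, d_z, t_z):
    """<Z> after layers of RY(theta_k), each followed by the noise map
    (x, z) -> (d_x * x, t_z + d_z * z) on the Bloch vector (x-z plane only)."""
    x, z = 0.0, 1.0  # initial state |0><0|
    for th in thetas:
        c, s = math.cos(th), math.sin(th)
        x, z = c * x + s * z, -s * x + c * z   # RY rotation
        x, z = d_x * x, t_z + d_z * z          # noise in normal form
    return z


def grad(thetas, k, **noise):
    """Parameter-shift derivative of <Z> with respect to thetas[k]."""
    plus, minus = list(thetas), list(thetas)
    plus[k] += math.pi / 2
    minus[k] -= math.pi / 2
    return 0.5 * (z_expectation(plus, **noise) - z_expectation(minus, **noise))


def mean_sq_grad(L, k, samples=1000, seed=0, **noise):
    """Monte Carlo estimate of E[(dC/dtheta_k)^2] over uniform random angles."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(L)]
        total += grad(thetas, k, **noise) ** 2
    return total / samples


p, gamma = 0.1, 0.1  # illustrative noise strengths (assumptions)
DEPOLARIZING = dict(d_x=1 - p, d_z=1 - p, t_z=0.0)
AMPLITUDE_DAMPING = dict(d_x=math.sqrt(1 - gamma), d_z=1 - gamma, t_z=gamma)

if __name__ == "__main__":
    for L in (10, 20, 40):
        for name, noise in (("depolarizing", DEPOLARIZING),
                            ("amplitude damping", AMPLITUDE_DAMPING)):
            first = mean_sq_grad(L, 0, **noise)
            last = mean_sq_grad(L, L - 1, **noise)
            print(f"L={L:2d} {name:17s} first-layer {first:.2e} last-layer {last:.2e}")
```

In this toy model the depolarizing case has the closed form $\mathbb{E}_\theta[(\partial_i C)^2] = (1-p)^{2L}/2$ for every layer, so all gradients vanish with depth. Under amplitude damping the first-layer gradient still decays, but the last-layer gradient settles at a depth-independent value, mirroring the statement that only the final layers remain trainable.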

3. Effective Circuit Shallowness and Classical Simulability

Noise does not merely induce gradient suppression; it also "truncates" the effective quantum circuit depth. More precisely, the expectation of a local Pauli $P$ after a depth-$L$ circuit under noise is, up to an exponentially small error, determined solely by the final $\ell$ layers:

$\mathbb{E}_\theta\left|\operatorname{Tr}[P\,\Phi_{[1,L]}(\rho_0)] - \operatorname{Tr}[P\,\Phi_{[L-\ell+1,L]}(\rho_0)]\right| \le e^{-\Omega(\ell)}$

(Mele et al., 2024), where $\Phi_{[a,b]}$ denotes the noisy evolution through layers $a$ to $b$. The circuit becomes "effectively shallow" for all observable estimation tasks: for local cost functions and unital or non-unital noise, only the last $O(\log n)$ layers affect measurable outcomes or gradients.

This effective shallowness has a direct impact on classical simulation complexity. By propagating observables backward through only the last $O(\log n)$ layers, classical algorithms can estimate expectation values with runtime polynomial in $n$ for 1D circuits and quasi-polynomial for higher-dimensional connectivity, irrespective of total physical depth (Mele et al., 2024). This principle underpins efficient classical simulation and rules out quantum advantage for expectation value estimation under generic noise without error correction.
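The truncation idea can be checked in a single-qubit toy model: discard all but the last `ell` layers, start the shortened circuit from the maximally mixed state, and compare $\langle Z\rangle$ with the full circuit. This is a minimal forward-evolution sketch under assumed amplitude damping of strength `gamma`, not the backward-propagation algorithm of the cited work; all names and parameter values are illustrative. Because each noisy layer contracts Bloch-vector differences by at most $\sqrt{1-\gamma}$, the error is bounded by $(1-\gamma)^{\ell/2}$ in this model.

```python
import math
import random


def evolve(w, thetas, gamma):
    """Apply layers of RY(theta) followed by amplitude damping to a Bloch
    vector restricted to the x-z plane; returns the final (x, z)."""
    x, z = w
    for th in thetas:
        c, s = math.cos(th), math.sin(th)
        x, z = c * x + s * z, -s * x + c * z                       # RY rotation
        x, z = math.sqrt(1 - gamma) * x, gamma + (1 - gamma) * z   # damping
    return x, z


def truncation_error(thetas, gamma, ell):
    """|<Z>_full - <Z>_truncated|, where the truncated circuit keeps only the
    last `ell` layers and starts from the maximally mixed state (w = 0)."""
    full = evolve((0.0, 1.0), thetas, gamma)[1]
    last_layers = thetas[len(thetas) - ell:]
    truncated = evolve((0.0, 0.0), last_layers, gamma)[1]
    return abs(full - truncated)


if __name__ == "__main__":
    rng = random.Random(1)
    gamma, L = 0.1, 60  # illustrative values (assumptions)
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(L)]
    for ell in (0, 5, 10, 20, 40):
        bound = (1 - gamma) ** (ell / 2)
        err = truncation_error(thetas, gamma, ell)
        print(f"ell={ell:2d} error={err:.2e} bound={bound:.2e}")
```

The error falls geometrically in `ell` regardless of the total depth, so a number of retained layers logarithmic in the target precision suffices in this toy setting.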

4. Experimental Observations and Absence of NIBP under Non-Unital Noise

Experimental studies using large-scale superconducting hardware (IBM Falcon and Heron processors, up to 102 qubits) have tested NIBP predictions in regimes dominated by amplitude damping (non-unital, $T_1$ relaxation) noise versus depolarizing (unital) noise (Schmitt et al., 26 Feb 2026). Measurement of the average gradient norm versus circuit runtime using Information Content Landscape Analysis (ICLA) reveals:

Under depolarizing noise, gradient norms decay exponentially to zero within characteristic circuit times, consistent with NIBP theory.
Under amplitude damping, gradient norms saturate to a finite plateau beyond a hardware- and noise-dependent "flattening time". There is no exponential vanishing, and finite gradient magnitudes persist at long circuit depths, even up to $n = 102$ qubits.
The quantum hardware's effective coherence time is determined by the worst-performing qubits, not by the average $T_1$ calibration value. Device benchmarking based solely on average metrics may underestimate the onset and severity of trainability bottlenecks (Schmitt et al., 26 Feb 2026).

Classical simulations corroborate experimental findings: NIBPs are observed under depolarizing noise but are absent under amplitude damping, due to the fixed-point structure and residual parameter dependence in the steady state for non-unital channels. Local cost function optimization remains feasible in the presence of non-unital noise.
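The remark about worst-performing qubits can be made concrete with a small calculation. The sketch below converts a relaxation time into a per-layer amplitude-damping probability using the standard relation $\gamma = 1 - e^{-t_{\text{layer}}/T_1}$ and compares the value implied by the average $T_1$ with that of the worst qubit. The $T_1$ values and layer duration are hypothetical numbers chosen for illustration only; they are not data from the cited experiment.

```python
import math


def damping_per_layer(t1_us, layer_time_us):
    """Amplitude-damping probability accumulated over one layer: 1 - exp(-t/T1)."""
    return 1.0 - math.exp(-layer_time_us / t1_us)


# Hypothetical per-qubit T1 values in microseconds (illustrative, not device data).
t1_values = [250.0, 240.0, 260.0, 30.0, 255.0]
layer_time = 0.5  # hypothetical layer duration in microseconds

mean_t1 = sum(t1_values) / len(t1_values)
worst_t1 = min(t1_values)
gamma_from_mean = damping_per_layer(mean_t1, layer_time)
gamma_worst = damping_per_layer(worst_t1, layer_time)

if __name__ == "__main__":
    print(f"mean T1  = {mean_t1:.1f} us -> gamma per layer = {gamma_from_mean:.4f}")
    print(f"worst T1 = {worst_t1:.1f} us -> gamma per layer = {gamma_worst:.4f}")
```

With these assumed numbers a single outlier qubit damps several times faster per layer than the average calibration figure suggests, which is why averages alone can misjudge when gradients start to flatten.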

5. Mitigation and Avoidance of NIBP: Dissipative Algorithms and Engineered Noise

Several strategies have emerged for mitigating or circumventing NIBP:

1. Engineered Dissipation and Non-Unital Circuits:

Embedding dissipative steps or periodic resets within the circuit introduces non-unital channels that can "extract entropy," maintain nonzero gradients, and guarantee trainability even at large depths (Sannia et al., 2023, Zapusek et al., 2 Jul 2025). For example, resetting a fraction of the ancillary qubits via amplitude damping after every fixed number of layers yields a lower bound on the gradient variance for parameters in the light cone of the observable, eliminating NIBP for local observables. Analytic conditions and numerical evidence confirm scalable trainability with this approach, even as the circuit depth increases (Zapusek et al., 2 Jul 2025).
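The effect of periodic resets can be sketched in a single-qubit caricature: depolarizing (unital) noise acts after every rotation, and after every `k` layers the qubit is reset to $|0\rangle$ with probability `f`. This is an assumption-laden toy, since the reset acts directly on the computational qubit rather than on ancillas as in the cited schemes, and the names and values of `p`, `f`, and `k` are illustrative. It compares the mean squared last-layer gradient with and without the reset.

```python
import math
import random


def z_expectation(thetas, p, f, k):
    """<Z> after layers of RY(theta) + depolarizing noise of strength p, with a
    probabilistic reset to |0> (probability f) after every k-th layer."""
    x, z = 0.0, 1.0  # initial state |0><0|, Bloch vector in the x-z plane
    for layer, th in enumerate(thetas, start=1):
        c, s = math.cos(th), math.sin(th)
        x, z = c * x + s * z, -s * x + c * z      # RY rotation
        x, z = (1 - p) * x, (1 - p) * z           # depolarizing (unital)
        if f > 0 and layer % k == 0:
            x, z = (1 - f) * x, f + (1 - f) * z   # partial reset (non-unital)
    return z


def mean_sq_last_grad(L, p, f, k, samples=1000, seed=0):
    """Monte Carlo estimate of E[(dC/dtheta_L)^2] via the parameter-shift rule."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(L)]
        plus, minus = list(thetas), list(thetas)
        plus[-1] += math.pi / 2
        minus[-1] -= math.pi / 2
        g = 0.5 * (z_expectation(plus, p, f, k) - z_expectation(minus, p, f, k))
        total += g * g
    return total / samples


if __name__ == "__main__":
    p, k = 0.05, 5  # illustrative values (assumptions)
    for L in (20, 50, 100):
        no_reset = mean_sq_last_grad(L, p, 0.0, k)
        with_reset = mean_sq_last_grad(L, p, 0.5, k)
        print(f"L={L:3d} no reset {no_reset:.2e} with reset {with_reset:.2e}")
```

Without the reset the gradient shrinks as $(1-p)^{2L}/2$; with it, the state is periodically re-polarized and the last-layer gradient stays at a depth-independent level in this model.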

2. Non-Unitary Variational Ansätze:

Non-unitary variational layers that incorporate jump operators and dissipative dynamics directly (i.e., variational Lindblad channels) have been introduced in both mean-field models and realistic quantum chemistry simulations. They enable preparation of open-system steady states and preserve trainability under realistic noise (Dowarah et al., 28 May 2026). Analysis of circuit fixed points reveals that purely unitary circuits subject to unital noise converge to the maximally mixed state (and hence to an NIBP), whereas non-unitary channels with multiple steady states yield optimization landscapes that retain nontrivial, parameter-dependent structure.

3. Local Pre-Training Strategies:

For applications such as geometric entanglement measurement, sequentially optimizing a series of commuting local cost functions, followed by global cost refinement, can enable escape from noise-induced barren plateaus without needing non-unitary channels (Zambrano et al., 2023).

6. Practical Implications and Remaining Challenges

The discovery and analysis of NIBPs have major implications:

Trainability bounds: In the absence of error correction or engineered dissipation, any variational quantum circuit suffering unital noise will manifest an NIBP when the depth scales with system size. Only circuits with constant/logarithmic depth, or circuits executing with non-unital noise channels, remain trainable for local observables (Mele et al., 2024, Singkanipa et al., 2024).
Limitations for quantum advantage: For algorithms whose outputs are local observable estimations (including many quantum machine learning and quantum chemistry protocols), noise-induced effective shallowness and NIBP preclude scaling advantages without fault tolerance or dissipative engineering. Only shallow circuits or hybrid quantum-classical strategies may remain viable.
Device benchmarking: Reliable prediction of NIBP onset requires statistical analysis of device noise distributions, not mere averages; performance is dominated by the worst-case qubits and coherence times (Schmitt et al., 26 Feb 2026).
Generalization beyond unital noise: Rigorous results confirm that NIBP only occurs generically in circuits with unital noise. Non-unital (Hilbert-Schmidt contractive) noise yields noise-induced limit sets (NILS), in which cost values concentrate within a parameter-dependent interval but do not uniformly collapse gradients to zero (Singkanipa et al., 2024).
Open questions: Effective-depth bounds for worst-case circuits, architectural evasion mechanisms, complexity of sampling from noisy circuits, and connection to measurement-induced phase transitions remain open research areas (Mele et al., 2024).

7. Summary Table: NIBP Behavior by Noise Model

Noise Model	Gradient Variance Decay	Trainability for Local Cost	Global Cost/Observable
Unital (e.g., depolarizing, dephasing)	Exponential in depth/system size	No (NIBP)	Exponentially flat
Non-unital (e.g., amplitude damping, engineered reset)	No exponential decay in last $O(\log n)$ layers	Yes (locally, only last layers trainable)	Still untrainable

The table summarizes critical findings from (Mele et al., 2024, Wang et al., 2020, Singkanipa et al., 2024, Schmitt et al., 26 Feb 2026, Zapusek et al., 2 Jul 2025).

References

"Noise-induced shallow circuits and absence of barren plateaus" (Mele et al., 2024)
"Experimental demonstration of the absence of noise-induced barren plateaus using information content landscape analysis" (Schmitt et al., 26 Feb 2026)
"Barren Plateaus in Variational Quantum Computing" (Larocca et al., 2024)
"Mitigating Noise-Induced Barren Plateaus Using a Non-Unitary Ansatz" (Dowarah et al., 28 May 2026)
"Engineered dissipation to mitigate barren plateaus" (Sannia et al., 2023)
"Scaling Quantum Algorithms via Dissipation: Avoiding Barren Plateaus" (Zapusek et al., 2 Jul 2025)
"Emergence of noise-induced barren plateaus in arbitrary layered noise models" (Schumann et al., 2023)
"Avoiding barren plateaus in the variational determination of geometric entanglement" (Zambrano et al., 2023)
"Noise-Induced Barren Plateaus in Variational Quantum Algorithms" (Wang et al., 2020)
"Beyond unital noise in variational quantum algorithms: noise-induced barren plateaus and limit sets" (Singkanipa et al., 2024)