XEB Fidelity: Benchmarking Quantum Circuits

Updated 23 August 2025

XEB fidelity is a protocol that compares experimental output distributions with ideal ones from quantum circuits, serving as a key metric for quantum supremacy certification.
It employs the linear XEB formula to estimate state fidelity in the low-noise regime, highlighting the impact of noise and decoherence on circuit performance.
Despite its effectiveness, XEB fidelity faces challenges such as noise-induced phase transitions and classical spoofing, necessitating advanced simulation algorithms and circuit designs.

Cross-Entropy Benchmarking (XEB) Fidelity is a statistical protocol designed to quantify the agreement between experimentally sampled output distributions from quantum circuits and the ideal distributions predicted by quantum theory. Originating as a key metric for certifying random circuit sampling experiments and supporting claims of quantum computational supremacy, XEB fidelity measures the extent to which a quantum device produces “heavy” outcomes (those with higher-than-uniform probabilities) relative to the complex, non-classical distribution expected from a random quantum circuit. The theoretical and practical subtleties of XEB fidelity relate to its conditional classical hardness, its sensitivity to noise and the error model, its potential for classical spoofing, and its role as a fidelity estimator and diagnostic in both device-level and gate-level characterization.

1. Mathematical and Operational Definition

Let $C$ be a quantum circuit on $n$ qubits, and let $q_C(x) = |\langle x|C|0^n\rangle|^2$ be the ideal output probability for bitstring $x\in\{0,1\}^n$. The “linear XEB” fidelity is defined for a distribution $p$ (typically from experimental or simulated measurement outcomes) as

$\mathcal{F}_{C}(p) = 2^n \mathbb{E}_{x \sim p} q_C(x) - 1,$

where $\mathbb{E}_{x \sim p}$ indicates the average over samples $x\sim p$ (Aaronson et al., 2019, Barak et al., 2020).

In practice, for $k$ samples $z_1,\dots,z_k$ drawn from the device,

$\mathrm{XEB} = \frac{2^n}{k} \sum_{i=1}^k q_C(z_i) - 1.$

A trivial simulator (such as uniform random sampling) yields $\mathcal{F}_C(p) = 0$, while an ideal (noise-free) quantum device yields $\mathcal{F}_C(p) \approx 1$ on average over random circuits; a noisy device yields a smaller value (Barak et al., 2020, Kasirajan et al., 2024). Intermediate values indicate partial agreement.
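
The two limiting cases can be checked numerically. The sketch below is illustrative only: it assumes a Haar-random state vector as a stand-in for the output of a deep random circuit, and the names `haar_random_state` and `linear_xeb` are hypothetical helpers, not part of any library.

```python
import numpy as np

def haar_random_state(n_qubits, rng):
    """Return a Haar-random pure state on n_qubits (stand-in for a deep random circuit)."""
    dim = 2 ** n_qubits
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def linear_xeb(ideal_probs, samples):
    """Linear XEB estimator: (2^n / k) * sum_i q_C(z_i) - 1."""
    dim = ideal_probs.size
    return dim * ideal_probs[samples].mean() - 1.0

rng = np.random.default_rng(0)
n = 12          # small enough to hold all 2^n probabilities in memory
k = 200_000     # number of samples

q = np.abs(haar_random_state(n, rng)) ** 2      # ideal probabilities q_C(x)

# Bitstrings are represented by their integer index in [0, 2^n).
ideal_samples = rng.choice(q.size, size=k, p=q)      # noise-free device
uniform_samples = rng.integers(q.size, size=k)       # trivial simulator

xeb_ideal = linear_xeb(q, ideal_samples)       # expected to be close to 1
xeb_uniform = linear_xeb(q, uniform_samples)   # expected to be close to 0
print(f"ideal sampler:   {xeb_ideal:.3f}")
print(f"uniform sampler: {xeb_uniform:.3f}")
```

For a single circuit instance the ideal value fluctuates around 1, with a spread that shrinks as $n$ grows; the value 1 is exact only on average over the ensemble.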

2. Theoretical Foundations and Fidelity Estimation

The fundamental idea is that output bitstrings from random quantum circuits have a heavy-tailed distribution (Porter–Thomas), which is believed to be hard to reproduce classically. A quantum device that samples from a genuinely chaotic circuit will yield bitstrings $x$ such that the empirical average of $q_C(x)$ is significantly larger than $1/2^n$, the value from uniform random guessing.

Under suitable noise assumptions, the expected XEB value serves as an estimator of the state fidelity, i.e., the overlap between the noise-free state $|\psi\rangle$ and the noisy experimental state $\rho$: $\mathcal{F}_{\text{state}} = \langle \psi | \rho | \psi \rangle.$ In the low-noise regime ($\varepsilon n\ll 1$, with $\varepsilon$ the per-qubit error rate and $n$ the number of qubits), XEB approximates the fidelity, and their exponential decays with circuit depth match quantitatively (Ware et al., 2023, Gao et al., 2021). For stronger noise, this correspondence breaks down sharply at a critical value (see §4).
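
A short worked calculation shows why XEB can track fidelity. It assumes a global white-noise (depolarizing) model, $\rho = F\,|\psi\rangle\langle\psi| + (1-F)\,I/2^n$, which is a simplifying assumption introduced here for illustration and is not the general noise model; it also uses the Porter–Thomas property $2^n \sum_x q_C(x)^2 \approx 2$.

```latex
\begin{align*}
p(x) &= F\, q_C(x) + \frac{1-F}{2^n}, \\
\mathcal{F}_C(p) &= 2^n \sum_x p(x)\, q_C(x) - 1 \\
  &= F \cdot 2^n \sum_x q_C(x)^2 + (1-F) \sum_x q_C(x) - 1 \\
  &= F \left( 2^n \sum_x q_C(x)^2 - 1 \right) \;\approx\; F, \\
\langle \psi | \rho | \psi \rangle &= F + \frac{1-F}{2^n} \;\approx\; F .
\end{align*}
```

Under this model both quantities reduce to $F$ up to corrections of order $2^{-n}$ and instance-to-instance fluctuations of $\sum_x q_C(x)^2$.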

3. Hardness, Spoofing, and Complexity Assumptions

The classical hardness of spoofing XEB hinges on the Linear Cross-Entropy Quantum Threshold Assumption (XQUATH): that no polynomial-time classical algorithm can estimate an output probability $q_C(x)$ of a random circuit with mean-squared error appreciably smaller (by order $2^{-3n}$) than that of the trivial estimator, which always outputs $1/2^n$. If XQUATH holds, then no classical algorithm can systematically produce samples with XEB values nontrivially above zero (Aaronson et al., 2019). The reduction exploits the fact that any algorithm passing the XEB test better than the trivial bound would enable more accurate amplitude estimation than XQUATH allows.

However, for circuits of sublinear depth, XQUATH fails: efficient tensor-network or marginal-based classical algorithms can produce samples with nontrivial XEB values, thereby “spoofing” the benchmark (Barak et al., 2020). More generally, deliberate circuit partitioning, targeting bright subspaces, or exploitation of local marginals can yield high XEB without true global quantum complexity (Gao et al., 2021, Oh et al., 2022).

In the context of related metrics—e.g., the system linear cross-entropy score (sXES), targeting Hamiltonian simulation—distinct complexity assumptions (such as sXQUATH) have also failed for sublinear-depth circuits (Tanggara et al., 2024).

4. Noise, Phase Transitions, and Fidelity Limits

The XEB fidelity demonstrates a sharp phase transition as a function of the global error rate $\varepsilon N$ (where $N$ is the number of qubits or qudits) (Ware et al., 2023). In the regime $\varepsilon N\ll 1$, XEB and fidelity decay together as $(1-\varepsilon)^N$ per cycle. Above a critical threshold

$(1-\varepsilon)^N = \Lambda_g,$

with $\Lambda_g$ a spectral gap associated with the circuit’s transfer matrix, the XEB decay rate “saturates” at a noise-independent value and ceases to track the fidelity. The critical value depends on both connectivity and the entangling power of the two-qubit gate set (parameterized by $(\alpha, \beta)$ in the transfer matrix formalism); gates with higher entangling power make XEB a more robust fidelity proxy. This transition can be framed as an eigenvalue crossing in a corresponding statistical mechanical model (Ware et al., 2023, Morvan et al., 2023).
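
The threshold condition $(1-\varepsilon)^N = \Lambda_g$ can be inverted for the critical per-qubit error rate. The sketch below does only that arithmetic; the function names and the value $\Lambda_g = 0.5$ are hypothetical placeholders, since the actual gap depends on the gate set and connectivity.

```python
def critical_error_rate(spectral_gap, n_qubits):
    """Solve (1 - eps)^N = Lambda_g for the per-qubit error rate eps."""
    if not 0.0 < spectral_gap < 1.0:
        raise ValueError("spectral_gap must lie strictly between 0 and 1")
    return 1.0 - spectral_gap ** (1.0 / n_qubits)

def in_low_noise_phase(eps, spectral_gap, n_qubits):
    """True when the per-cycle fidelity factor (1 - eps)^N still exceeds Lambda_g."""
    return (1.0 - eps) ** n_qubits > spectral_gap

gap = 0.5        # hypothetical value of Lambda_g, for illustration only
n_qubits = 50
eps_c = critical_error_rate(gap, n_qubits)

print(f"critical per-qubit error rate: {eps_c:.5f}")
print("eps = eps_c / 2 in low-noise phase:", in_low_noise_phase(eps_c / 2, gap, n_qubits))
print("eps = 2 * eps_c in low-noise phase:", in_low_noise_phase(2 * eps_c, gap, n_qubits))
```

Because the condition involves $(1-\varepsilon)^N$, the tolerable per-qubit error rate shrinks roughly as $1/N$ for a fixed gap.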

Deviations from ergodicity, namely the mismatch between the ensemble and instance-wise averages of suitable scheme functions (such as $p(x)^2$), serve as a direct indicator of fidelity breakdown in the noisy regime and provide a unifying framework for generalized XEB (Cheng et al., 13 Feb 2025).

5. Practical Implementation: Algorithms and Verification

Calculation of XEB fidelity for large circuits involves simulating output probabilities for experimentally observed bitstrings. Direct brute-force amplitude calculation is intractable for $n\gtrsim50$, but advanced methods such as multi-tensor contraction algorithms, exploiting memoization and contraction tree optimization, can batch and cache common computational steps, enabling efficient verification up to 16–20 cycles for $n=53$ (Sycamore) (Kalachev et al., 2021).
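
To give a sense of scale for the brute-force limit, the sketch below estimates the memory needed to store a full state vector. It assumes one double-precision complex amplitude (16 bytes) per basis state, which is one common brute-force strategy and not the only one; the helper name `statevector_bytes` is hypothetical.

```python
BYTES_PER_AMPLITUDE = 16  # one complex128 value (assumed storage format)

def statevector_bytes(n_qubits):
    """Memory in bytes for a dense state vector of 2^n complex128 amplitudes."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE

for n in (30, 40, 53):
    gib = statevector_bytes(n) / 2 ** 30
    print(f"n = {n}: {gib:,.0f} GiB")
```

Each added qubit doubles the requirement, which is why verification at $n=53$ relies on tensor-network contraction of individual amplitudes instead of storing the whole state.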

For further scalability, Clifford XEB replaces Haar-random or fSim circuits with Clifford circuits, exploiting classical simulability of stabilizer states; this enables XEB benchmarking of devices with hundreds to over a thousand qubits, at the cost of some reduction in scrambling properties (Chen et al., 2022).

Particle-number-conserving (U(1) symmetric) circuit ensembles can also drastically reduce simulation complexity, shrinking the relevant Hilbert space from $2^N$ to $\binom{N}{n}$ for $N$ qubits and a fixed particle number $n$. The modified LXEB (MLXEB) protocol enables large-scale benchmarking under such constraints; fidelity estimation remains valid as long as the noise remains weak and particle number is strictly preserved (Kaneda et al., 16 May 2025).
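
The size of this reduction is easy to tabulate. In the sketch below, `n_sites` plays the role of $N$ and `n_particles` is the conserved particle number; both names and the example values are chosen here for illustration.

```python
from math import comb

def full_dimension(n_sites):
    """Dimension of the unconstrained Hilbert space, 2^N."""
    return 2 ** n_sites

def sector_dimension(n_sites, n_particles):
    """Dimension of the fixed-particle-number sector, binomial(N, n)."""
    return comb(n_sites, n_particles)

for n_sites, n_particles in [(20, 10), (40, 4), (60, 3)]:
    full = full_dimension(n_sites)
    sector = sector_dimension(n_sites, n_particles)
    print(f"N = {n_sites}, n = {n_particles}: "
          f"sector = {sector:,}, full = {full:,}, ratio = {sector / full:.2e}")
```

The saving is modest at half filling and becomes dramatic at low filling, where the sector dimension grows only polynomially in $N$.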

In monitored circuits exhibiting measurement-induced phase transitions (MIPT), XEB can be enhanced via machine learning generative modeling (e.g., recurrent neural networks) to effectively reduce the sample complexity for estimating statistical order parameters (Hu et al., 22 Jan 2025).

6. Experimental Applications and Gate Characterization

XEB is applied both to system-level and gate-level benchmarking:

In large-scale quantum processors, XEB enables the certification of random circuit sampling behavior and can approximate global state fidelity, provided the device operates in the low-noise phase (Morvan et al., 2023, Kasirajan et al., 2024).
High-fidelity gate operations, especially for critical two-qubit gates (e.g., CZ and CNOT), are certified using XEB by interleaving the target gate within randomized layers and extracting decay rates; reported fidelities such as $(99.15 \pm 0.02)\%$ for remote CNOT gates underscore the advancing performance of modular quantum hardware (Ye et al., 2021, Song et al., 2024).
In cavity qudit systems, XEB serves as an algorithm-agnostic measure for the effective controllability and error rate in implementing Haar-random unitaries; high XEB correlates with successful compilation of large-dimension operations (Bornman et al., 2024).
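
The decay-rate extraction used in gate-level XEB can be sketched as a log-linear fit of XEB against cycle number. Everything below is an illustrative assumption: the data are synthetic, the decay constants 0.990 and 0.982 are invented, and the ratio of interleaved to reference decay is shown only as a rough per-gate figure, since converting it to a gate fidelity requires dimension-dependent factors not covered here.

```python
import numpy as np

def fit_exponential_decay(depths, xeb_values):
    """Fit xeb ~ A * p**m by least squares on log(xeb); return (A, p)."""
    slope, intercept = np.polyfit(depths, np.log(xeb_values), 1)
    return float(np.exp(intercept)), float(np.exp(slope))

def synthetic_xeb(depths, amplitude, decay, rng, rel_noise=0.01):
    """Generate A * p**m with small multiplicative noise (synthetic data only)."""
    jitter = np.exp(rng.normal(scale=rel_noise, size=depths.size))
    return amplitude * decay ** depths * jitter

rng = np.random.default_rng(1)
depths = np.arange(2, 41, 2)   # number of cycles m

# Hypothetical per-cycle decay constants for the two experiments.
reference = synthetic_xeb(depths, 0.95, 0.990, rng)     # random layers only
interleaved = synthetic_xeb(depths, 0.95, 0.982, rng)   # random layers + target gate

_, p_ref = fit_exponential_decay(depths, reference)
_, p_int = fit_exponential_decay(depths, interleaved)
decay_ratio = p_int / p_ref    # decay attributable to the interleaved gate

print(f"reference decay per cycle:   {p_ref:.4f}")
print(f"interleaved decay per cycle: {p_int:.4f}")
print(f"ratio (interleaved / ref):   {decay_ratio:.4f}")
```

Fitting in log space keeps the estimate linear and separates the decay constant from state-preparation and measurement errors, which are absorbed into the prefactor $A$.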

7. Limitations, Controversies, and Extensions

XEB’s reliability as a fidelity estimator is conditional on several factors:

For uncorrelated, homogeneous, and sufficiently weak noise, XEB approximates state fidelity; under correlated or adversarial error, or circuit partitioning, this correspondence fails (Gao et al., 2021, Oh et al., 2022).
Classical spoofing via computational shortcuts (e.g., light-cone truncation, local-marginal exploitation, or heavy-output post-selection) can produce nontrivial XEB scores even absent true quantum complexity (Barak et al., 2020, Oh et al., 2022).
Additive scaling of XEB contrasts with the inherently multiplicative nature of fidelity (over disjoint subsystems), enabling spoofing strategies based on partitioning (Gao et al., 2021).
The ergodicity framework exposes explicit noise thresholds beyond which XEB ceases to be reliable; deviations from ergodicity serve as a rigorous benchmark for detecting strong noise (Cheng et al., 13 Feb 2025).
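
The additive-versus-multiplicative point can be made concrete for two disjoint, unentangled subsystems. The sketch below is not the spoofing algorithm from the cited work; it only checks the identity $1 + \mathrm{XEB}_{AB} = (1 + \mathrm{XEB}_A)(1 + \mathrm{XEB}_B)$ for product distributions, using a white-noise model and Porter–Thomas-like probabilities as stated assumptions. All function names are hypothetical.

```python
import numpy as np

def porter_thomas_probs(n_qubits, rng):
    """Output probabilities of a Haar-random state (stand-in for a random circuit)."""
    dim = 2 ** n_qubits
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    q = np.abs(v) ** 2
    return q / q.sum()

def noisy_distribution(q, fidelity):
    """White-noise model (an assumption): fidelity * q + (1 - fidelity) * uniform."""
    return fidelity * q + (1.0 - fidelity) / q.size

def exact_linear_xeb(q, p):
    """Exact linear XEB, 2^n * sum_x p(x) q(x) - 1, with no sampling error."""
    return q.size * float(np.dot(p, q)) - 1.0

rng = np.random.default_rng(2)
q_a = porter_thomas_probs(8, rng)
q_b = porter_thomas_probs(8, rng)
p_a = noisy_distribution(q_a, 0.3)
p_b = noisy_distribution(q_b, 0.3)

xeb_a = exact_linear_xeb(q_a, p_a)
xeb_b = exact_linear_xeb(q_b, p_b)

# Two independent halves: ideal and noisy distributions are both products.
xeb_joint = exact_linear_xeb(np.kron(q_a, q_b), np.kron(p_a, p_b))

print(f"XEB_A = {xeb_a:.3f}, XEB_B = {xeb_b:.3f}")
print(f"joint XEB            = {xeb_joint:.3f}")
print(f"XEB_A + XEB_B        = {xeb_a + xeb_b:.3f}")
print(f"XEB_A * XEB_B        = {xeb_a * xeb_b:.3f}")
```

For small subsystem values the joint XEB is close to the sum $\mathrm{XEB}_A + \mathrm{XEB}_B$, whereas a fidelity over disjoint subsystems would multiply, so the joint XEB overstates the fidelity of the partitioned system.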

Proposed remedies include:

Combining XEB with more robust, possibly nonlinear benchmarks or those measuring total variation distance, albeit at greater experimental cost.
Circuit designs utilizing high-scrambling and high entangling-power gates, and architectures impeding classical partitioning or marginal inference (Ware et al., 2023).
Particle-number-conserving benchmarking and large-scale Clifford-based XEB as scalable alternatives, with care taken about the scope of the underlying symmetry constraints (Chen et al., 2022, Kaneda et al., 16 May 2025).

Summary Table: XEB Fidelity—Key Concepts

Aspect	Description	Primary Reference(s)
Definition	$\mathcal{F}_{C}(p) = 2^{n} \mathbb{E}_{x \sim p} q_C(x) - 1$	(Aaronson et al., 2019, Barak et al., 2020)
Fidelity estimation regime	Valid for $\varepsilon N \ll 1$, fails sharply above critical noise	(Ware et al., 2023, Morvan et al., 2023)
Hardness assumption	XQUATH (no efficient classical XEB spoofing), refuted for shallow depth	(Aaronson et al., 2019, Barak et al., 2020)
Classical spoofing methods	Local-marginals, Pauli path, heavy-output post-selection	(Barak et al., 2020, Oh et al., 2022)
Noise-induced phase transition	Abrupt breakdown of XEB–fidelity correspondence at $\varepsilon N_c$	(Ware et al., 2023)
Scalable verification techniques	Multi-tensor contraction, Clifford XEB, MLXEB with U(1) conservation	(Kalachev et al., 2021, Chen et al., 2022, Kaneda et al., 16 May 2025)

Cross-Entropy Benchmarking Fidelity is a versatile and experimentally amenable tool for quantum device certification and error diagnosis, but its robust interpretation as a marker of quantum supremacy and state fidelity is contingent on rigorous justification of noise conditions, circuit structure, and resistance to classical spoofing. Recent theoretical and experimental advances continue to refine and delineate the regimes where XEB is an informative and valid measure of “quantum advantage.”