Stabilizer Learning in Quantum and Control Systems
- Stabilizer learning subroutines are algorithmic methods for efficiently extracting the stabilizer structure of quantum states and dynamical systems using group-theoretic and convex optimization techniques.
- They enable practical applications such as quantum state tomography, error correction, and stabilizing controller synthesis with proven information-theoretic efficiency.
- Advanced extensions integrate bootstrapping, reinforcement learning, and parallelized sampling to address noise and high-dimensional complexities in both quantum and control settings.
Stabilizer learning subroutines are algorithmic methods for efficiently extracting, reconstructing, or exploiting the stabilizer structure of quantum states or dynamical systems in both quantum information and control-theoretic contexts. These subroutines leverage the algebraic properties of stabilizer states in quantum computing or the structural constraints of stabilizable dynamical systems to enable tasks such as quantum state tomography, robust dynamics identification, stabilizing controller synthesis, and resource-efficient state simulation. Such methods typically combine group-theoretic measurements, linear algebraic routines, convex optimization, and, in some settings, advanced sampling or kernel methods.
1. Foundations of Stabilizer Learning
Stabilizer states in quantum information theory are states uniquely specified as common eigenvectors of an abelian subgroup of the $n$-qubit Pauli group, with applications ranging from quantum error correction to classical simulation (Gottesman–Knill theorem) (Montanaro, 2017). For a stabilizer state $|\psi\rangle$, the stabilizer group $S = \{P \in \mathcal{P}_n : P|\psi\rangle = |\psi\rangle\}$ (an abelian subgroup of the $n$-qubit Pauli group $\mathcal{P}_n$) collects all Pauli operators that fix $|\psi\rangle$. This algebraic structure underpins efficient routines for learning, simulating, and classifying quantum states with low “magic.”
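As a concrete illustration of the eigenvector characterization, the following NumPy sketch checks that the two-qubit Bell state is fixed by the commuting Paulis $XX$ and $ZZ$ (the helper `kron` and the variable names are ours, not from the cited sources):

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron(*ops):
    """Tensor product of single-qubit operators, left to right."""
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out

# The Bell state (|00> + |11>)/sqrt(2) is the unique joint +1
# eigenvector of the abelian group generated by XX and ZZ.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

for label, P in [("XX", kron(X, X)), ("ZZ", kron(Z, Z))]:
    assert np.allclose(P @ bell, bell), f"{label} should stabilize the Bell state"
print("XX and ZZ stabilize the Bell state")
```

The two generators determine the state uniquely among two-qubit states, which is the general mechanism stabilizer learning exploits.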
In continuous control, “stabilizability” refers to the ability to design a controller such that any trajectory can be robustly stabilized—often formalized via contraction metrics or control-theoretic certificates (Singh et al., 2018, Singh et al., 2019). Learning stabilizable dynamics typically mandates that the underlying system model admits an exponentially stabilizing feedback controller, a property that is enforced through function space constraints or convex relaxations.
Key properties essential to these subroutines include:
- Unique characterization of stabilizer states by abelian subgroups of the Pauli group
- Efficient encoding of stabilizer groups as binary or tableau representations
- Existence of contraction metrics or Lyapunov certificates guaranteeing stabilizability in dynamical systems
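The binary encoding in the second bullet can be sketched as follows: each Pauli string maps to a pair of bit vectors $(x, z) \in \mathbb{F}_2^{2n}$, and two Paulis commute iff their symplectic inner product vanishes mod 2 (a standard fact; the function names here are illustrative):

```python
import numpy as np

def pauli_to_bits(pauli: str):
    """Encode a Pauli string (e.g. 'XZYI') as binary (x, z) vectors:
    X -> (1,0), Z -> (0,1), Y -> (1,1), I -> (0,0)."""
    x = np.array([c in "XY" for c in pauli], dtype=np.uint8)
    z = np.array([c in "ZY" for c in pauli], dtype=np.uint8)
    return x, z

def commute(p: str, q: str) -> bool:
    """Two Paulis commute iff the symplectic inner product
    x_p.z_q + x_q.z_p vanishes mod 2."""
    xp, zp = pauli_to_bits(p)
    xq, zq = pauli_to_bits(q)
    return (int(xp @ zq) + int(xq @ zp)) % 2 == 0

# A stabilizer group must be abelian: all generator pairs commute.
gens = ["XX", "ZZ"]              # stabilizer generators of the Bell state
assert all(commute(a, b) for a in gens for b in gens)
assert not commute("XI", "ZI")   # anticommuting pair: not a valid group
```

In this encoding a stabilizer group is an isotropic subspace of $\mathbb{F}_2^{2n}$, which is what makes linear algebra over $\mathbb{F}_2$ the natural post-processing tool.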
2. Quantum Algorithms and Efficient Stabilizer Tomography
The canonical subroutine for learning generic stabilizer states employs collective Bell-basis measurements on pairs of state copies (Montanaro, 2017). The protocol capitalizes on the following principles:
- Bell Sampling: A measurement in the Bell basis on two copies of a stabilizer state yields bitstrings that are uniformly distributed over a coset of the unknown stabilizer subgroup $S$ (viewed as a subspace of $\mathbb{F}_2^{2n}$).
- Coset Shift Cancellation: By accumulating several outcomes and computing pairwise XORs $x_i \oplus x_j$, one cancels the random coset shift, obtaining elements distributed uniformly in $S$ itself.
- Subspace Recovery: With $2n$ or more such samples, standard techniques (e.g., Gaussian elimination over $\mathbb{F}_2$) almost certainly recover a full basis of $S$, i.e., the complete stabilizer group.
- Sign Determination: Each Pauli generator’s sign is determined by measurement of the state in its eigenbasis.
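The coset-cancellation and subspace-recovery steps are purely classical and can be prototyped without any quantum simulation by modeling each Bell sample abstractly as (random element of $S$) XOR (fixed coset shift), as below; all names and the toy dimensions are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def gf2_row_reduce(M):
    """Gaussian elimination over F_2; returns the nonzero rows (a basis)."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, len(M)) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(len(M)):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return M[:rank]

# Toy model: the unknown stabilizer group S is the row space of G in
# F_2^{2n}; each "Bell sample" is a random element of S XORed with a
# fixed, unknown coset shift c.
n = 4
G = rng.integers(0, 2, size=(n, 2 * n), dtype=np.uint8)   # generators of S
c = rng.integers(0, 2, size=2 * n, dtype=np.uint8)        # coset shift

def bell_sample():
    coeffs = rng.integers(0, 2, size=n, dtype=np.uint8)
    return (coeffs @ G + c) % 2

# XOR consecutive samples: the shift c cancels, leaving elements of S.
samples = [bell_sample() for _ in range(3 * n)]
diffs = np.array([samples[i] ^ samples[i + 1] for i in range(len(samples) - 1)])

basis = gf2_row_reduce(diffs)
true_basis = gf2_row_reduce(G)
print("recovered rank:", len(basis), "true rank:", len(true_basis))
```

With a modest oversampling factor the recovered basis spans all of $S$ with high probability, mirroring the $2n$-sample guarantee in the text.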
This subroutine achieves information-theoretic optimality, requiring $O(n)$ copies, with classical post-processing cost dominated by Gaussian elimination over $\mathbb{F}_2$. The failure probability decays exponentially in the number of samples taken beyond $2n$.
A comparison of subroutine components is given below:
Step | Quantum Resource | Classical Post-Processing |
---|---|---|
Bell basis measurements | $2n+2$ state copies | None |
Coset cancellation | None | Bitwise XORs |
Stabilizer reconstruction | None | Gaussian elimination over $\mathbb{F}_2$ |
Sign determination | Eigenbasis measurements (one per generator) | None |
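For small systems, the sign-determination step can be checked by brute force: for a stabilizer state $|\psi\rangle$ and a Pauli $P$ with $P|\psi\rangle = \pm|\psi\rangle$, the sign equals the expectation $\langle\psi|P|\psi\rangle$. This dense-matrix sketch (our own, not the measurement-based protocol itself) illustrates the idea:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = {"I": np.eye(2, dtype=complex), "X": X, "Y": 1j * X @ Z, "Z": Z}

def pauli_matrix(s: str):
    """Dense matrix of a Pauli string such as 'XX' or 'IZ'."""
    out = np.array([[1.0 + 0j]])
    for c in s:
        out = np.kron(out, PAULI[c])
    return out

def sign_of(pauli: str, psi: np.ndarray) -> int:
    """For a stabilizer state psi with P psi = +/- psi, the sign is
    the expectation value <psi|P|psi> = +/-1."""
    val = np.real(np.vdot(psi, pauli_matrix(pauli) @ psi))
    return 1 if val > 0 else -1

# (|01> + |10>)/sqrt(2) is stabilized by +XX and -ZZ.
psi = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)
print(sign_of("XX", psi), sign_of("ZZ", psi))   # -> 1 -1
```

In the actual protocol the same information is obtained by measuring fresh copies of the state in each generator's eigenbasis, avoiding any exponential-size matrices.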
Extensions handle $t$-doped stabilizer states (outputs of Clifford circuits augmented with $t$ non-Clifford $T$ gates): an algebraic coset decomposition is constructed and reconstructed via characteristic-function or Bell-difference sampling (Leone et al., 2023), but the resource cost scales exponentially in $t$ (while remaining polynomial in $n$).
3. Robustness, Hardness, and Agnostic Learning
Stabilizer learning in the presence of noise is subject to severe computational hardness. In the PAC-learning framework, efficient learning of stabilizer states is possible in the noiseless case (via algebraic, non-SQ routines), but once label/classification or depolarizing noise is present, statistical query (SQ) models require exponentially many queries even to achieve constant error (Gollakota et al., 2021). This hardness is shown by reduction from the “learning parity with noise” (LPN) problem, rendering stabilizer learning in noisy regimes intractable for SQ-based algorithms.
In “agnostic” settings, where the task is to output a stabilizer state whose fidelity with the unknown state is close to the best achievable fidelity $\tau$, recursive “bootstrapping” combined with Bell difference sampling yields protocols with quasipolynomial runtime in $n$ (Chen et al., 13 Aug 2024). For states with high stabilizer dimension (e.g., $t$-doped stabilizer states), the runtime additionally grows with the non-stabilizer content, trading off coverage of the search space against efficiency.
4. Stabilizer Learning in Control and Dynamical Systems
In control-theoretic settings, stabilizer learning subroutines identify or enforce structure that ensures the existence of a stabilizing feedback policy. Key developments include:
- Control Contraction Metric (CCM) Regularization: The regression task of learning system dynamics is augmented with a contraction-theoretic constraint (a Control Contraction Metric), encoded as linear matrix inequalities (LMIs) over sampled state points, which enforces that the learned model admits an exponentially stabilizing controller (Singh et al., 2018, Singh et al., 2019). The dynamics and the metric are parameterized via feature mappings (random-feature approximations or kernel expansions), and their parameters are fit by alternating convex optimization. These methods achieve sample efficiency and robustness guarantees, particularly in underactuated and data-scarce regimes.
- Joint and Subspace-Based Stabilization: For families of unknown linear systems with shared structure, joint learning-based subroutines use data from multiple (potentially unstable) systems, with randomized feedback and rescaled least squares estimation to quickly estimate parameters and synthesize stabilizing policies. Pooling information reduces the required sample complexity compared to individual system identification (Faradonbeh et al., 2022). For high-dimensional LTI systems, learning the unstable subspace first and then applying policy gradient only on the corresponding reduced system drastically reduces sample complexity (from scaling with the full state dimension $n$ to scaling with the number $k \ll n$ of unstable modes) and accelerates stabilization (Toso et al., 2 May 2025).
- Reinforcement Learning with Stability Guarantees: Hybrid and modular methods embed stability-certifying mechanisms (e.g., Lyapunov functions, adaptive/backup controllers, Youla–Kučera parameterization) into RL pipelines. This includes on-the-fly learning from sampled (digital) data streams, switching to fallback control if the stability constraints are violated, or enforcing structure via neural Lyapunov certificates (Beckenbach et al., 2022, Lawrence et al., 2023, Ganai et al., 2023, Quartz et al., 12 Sep 2024).
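The certificate-checking idea behind the CCM and Lyapunov approaches above can be sketched in a linear, discrete-time stand-in: verify that a candidate metric $M$ strictly decreases along the closed loop, $M - (A+BK)^\top M (A+BK) \succ 0$. The numbers below are hypothetical and the function names are ours; the real methods enforce such LMIs over many sampled states of a nonlinear model:

```python
import numpy as np

def contraction_gap(A, B, K, M, margin=1e-6):
    """Smallest eigenvalue of M - (A+BK)^T M (A+BK) - margin*I.
    A positive value certifies exponential decrease of the metric M
    along the closed loop (a linear stand-in for the sampled CCM LMIs)."""
    Acl = A + B @ K
    gap = M - Acl.T @ M @ Acl - margin * np.eye(A.shape[0])
    return float(np.linalg.eigvalsh(gap).min())

# Toy "learned" model with an unstable mode (eigenvalue 1.1); K places
# the closed-loop poles at 0.5, 0.5.
A = np.array([[1.1, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
K = np.array([[-1.8, -1.0]])

# Build a valid metric M by iterating the discrete Lyapunov recursion
# M <- Acl^T M Acl + I (converges because the closed loop is Schur stable).
Acl = A + B @ K
M = np.eye(2)
for _ in range(200):
    M = Acl.T @ M @ Acl + np.eye(2)

print("certified:", contraction_gap(A, B, K, M) > 0)   # -> certified: True
```

With no feedback (`K = 0`) and `M = I` the gap is negative, i.e., the certificate correctly rejects the unstabilized system.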
5. Sampling, Parallelization, and Computational Scaling
For large-scale simulation and many-body systems, computational bottlenecks are addressed through efficient sampling, encoding, and parallelization:
- Biased Pauli Sampling in MPS: The stabilizer group of an MPS $|\psi\rangle$ (i.e., all Pauli strings $P$ with $P|\psi\rangle = |\psi\rangle$) is recovered via chain-rule factorized sampling and environment tensor contraction; partial Pauli strings with low probability are pruned at each step, and independent generators are extracted post hoc by Gaussian elimination over binary tableaux (Lami et al., 29 Jan 2024). The per-iteration cost is polynomial in the bond dimension, giving favorable scaling.
- Parallelized Stabilizer Simulation: Deep quantum circuits, as encountered in QML, can be simulated more efficiently by group-wise encoding of Clifford and non-Clifford gates, precomputed lookup tables for single- and two-qubit gate actions, and parallel mapping of operator blocks (Hai et al., 15 Feb 2025). This approach yields substantial speedups over state-vector simulation (demonstrated on 4-qubit, 60k-gate circuits) in the low-qubit, deep-circuit regime typical of quantum machine learning.
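The tableau bookkeeping that such simulators accelerate can be shown in miniature. The sketch below (our own, signs omitted for brevity, so phases are not tracked) stores each stabilizer generator's X/Z bits as a row and updates them under the standard Clifford conjugation rules:

```python
import numpy as np

# Minimal stabilizer tableau: row g holds the X/Z bits of one generator
# over n qubits. Signs are deliberately omitted in this sketch.
n = 2
tab_x = np.zeros((n, n), dtype=np.uint8)
tab_z = np.eye(n, dtype=np.uint8)       # |0...0>: generators Z_0, ..., Z_{n-1}

def apply_h(q):
    """Hadamard on qubit q conjugates X <-> Z: swap the bit columns."""
    tab_x[:, q], tab_z[:, q] = tab_z[:, q].copy(), tab_x[:, q].copy()

def apply_cnot(c, t):
    """CNOT maps X_c -> X_c X_t and Z_t -> Z_c Z_t under conjugation."""
    tab_x[:, t] ^= tab_x[:, c]
    tab_z[:, c] ^= tab_z[:, t]

# Prepare a Bell state: H on qubit 0, then CNOT(0, 1).
apply_h(0)
apply_cnot(0, 1)
# The generators are now XX and ZZ (up to sign).
print("x bits:", tab_x.tolist(), "z bits:", tab_z.tolist())
```

Lookup-table approaches precompute the effect of whole gate groups on such rows, replacing per-gate bit updates with batched table reads that parallelize well.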
6. Extensions: Structure Learning and Robust Decomposition
Recent advances have established that quantum states with nontrivial “stabilizer extent,” i.e., low stabilizer rank or close stabilizer decompositions, admit efficient subroutines for reconstructing the dominant stabilizer structure. Using inverse theorems for higher-order uniformity norms (such as the Gowers norm), any function (state) with sufficiently high uniformity admits a decomposition into a structured (stabilizer) component plus a small error term. Algorithmic self-correction—efficient under the algorithmic polynomial Freiman–Ruzsa (APFR) conjecture—permits polynomial- or quasipolynomial-time identification of an explicit stabilizer decomposition for any state with bounded stabilizer extent, using access to the state preparation unitary and its controlled version (Arunachalam et al., 7 Oct 2025).
The table below summarizes representative stabilizer learning subroutines and their regimes:
Subroutine | Regime/Target State | Resource Scaling | Noise Sensitivity |
---|---|---|---|
Bell sampling & coset cancellation | Pure stabilizer states | $O(n)$ samples, $\mathrm{poly}(n)$ time | Low |
Coset-based $t$-doped learning | $t$-doped stabilizers | $\mathrm{poly}(n)$, exponential in $t$ | Low (when $t$ is small) |
Agnostic bootstrapping | States with high stabilizer fidelity | Quasipolynomial in $n$ | Moderate/high (depends on fidelity) |
Joint learning, unstable subspace | High-dimensional linear systems | Scales with unstable-mode count $k \ll n$ | Robust (with appropriate noise handling) |
MPS chain-rule sampling | MPS or $T$-doped MPS | Polynomial in bond dimension per sweep | – |
Parallelized stabilizer simulation | Deep, low-qubit QML PQCs | Lookup tables, parallel gate blocks | – |
7. Practical Impact and Limitations
Stabilizer learning subroutines are essential for both quantum information (efficient tomography, error correction, structural understanding of quantum states) and cyber-physical systems (robust learning and control). Their primary advantages are information-theoretic efficiency, reliability guarantees, and explicit utilization of symmetry or stability certificates.
However, limitations persist:
- Subroutines exploiting explicit structure (e.g., Bell measurement on pure stabilizers) can be highly sensitive to noise. Statistical query-based methods become intractable for stabilizer learning under noise (Gollakota et al., 2021).
- The complexity of learning increases exponentially with the stabilizer “magic” (extent, $t$-doping, low stabilizer fidelity), restricting efficient application to states with large stabilizer dimension (Leone et al., 2023, Chen et al., 13 Aug 2024, Arunachalam et al., 7 Oct 2025).
- High-dimensional or compositional system identification benefits from joint/subspace-focused routines, but these rely on accurate estimation of subspaces or shared structure and may be affected by non-diagonalizability or coupling in practical implementations (Toso et al., 2 May 2025).
In summary, stabilizer learning subroutines represent a family of algorithmic methods specialized for extracting, reconstructing, or leveraging stabilizer structure in both quantum and control-theoretic domains. Their mathematical foundation—group-theoretic, algebraic, and convex-analytic—enables robust, efficient solutions provided that the underlying structure is present and that noise or “magic” content does not render the problem intractable. Continued refinement of these subroutines, including robustification against noise and the development of efficient hybrid quantum-classical routines, remains an active area of research.