Circuit Folding: Concepts & Applications

Updated 2 April 2026

Circuit folding is a set of transformative techniques that replace spatial parallelism with temporal or structural reuse across quantum, classical, and manufacturing domains.
It enables noise amplification in quantum error mitigation, reduces hardware and sampling overhead, and streamlines modular workload management.
It also underpins optimizations in VLSI, FPGA, and flexible kirigami circuits, offering trade-offs between resource efficiency, throughput, and mechanical durability.

Circuit folding is a family of architectural, algorithmic, and compiler transformations that systematically replace spatial parallelism with temporal or structural reuse in both quantum and classical computing, as well as in manufacturing processes for flexible circuits. The core principle is to reorganize computations, physical layouts, or quantum operations so that repeated or redundant structures are identified and exploited—either by time-multiplexing (hardware folding), structural reuse (circuit knitting/folding), or by deliberately expanding the depth or size of a quantum circuit (noise amplification for error mitigation). Across domains, folding is used to trade off hardware resources, classical sampling effort, or error characteristics for desired computational or physical properties.

1. Fold Transformations in Quantum Error Mitigation

Circuit folding is a central technique in digital zero-noise extrapolation (ZNE) for quantum error mitigation, particularly in NISQ devices. The canonical folding operation modifies a quantum circuit $C$ into a noise-amplified form by inserting unitary identities, thus increasing the gate count while preserving the output distribution in the absence of noise. The mathematical archetype is the global unitary folding construction:

$U_{\text{folded}}(\lambda) = U\,\big(U^\dagger U\big)^{\lambda - 1}$

with integer folding factor $\lambda \ge 1$. For each $\lambda > 1$, the ideal circuit action remains $U$ because $U^\dagger U = I$, but the gate count grows by a factor of $2\lambda - 1$, so the accumulated noise grows correspondingly if noise is gate-local and approximately Markovian.
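A minimal sketch of the global folding construction on a toy gate list may make the bookkeeping concrete. The tuple representation `(name, qubits, is_dagger)` and the helper names are assumptions made for illustration and do not correspond to any particular library.

```python
# Toy model: a circuit is a list of gates in time order.
# Each gate is (name, qubits, is_dagger); this layout is hypothetical.

def inverse(circuit):
    """Return the inverse circuit: reversed order, every gate daggered."""
    return [(name, qubits, not dagger) for name, qubits, dagger in reversed(circuit)]

def fold_global(circuit, lam):
    """Return U (U^dagger U)^(lam - 1) as a flat gate list."""
    if lam < 1:
        raise ValueError("folding factor must be an integer >= 1")
    folded = list(circuit)
    for _ in range(lam - 1):
        folded += inverse(circuit) + list(circuit)
    return folded

def cancel_adjacent_inverses(circuit):
    """Noise-free simplification: drop adjacent (G, G^dagger) pairs."""
    stack = []
    for name, qubits, dagger in circuit:
        if stack and stack[-1] == (name, qubits, not dagger):
            stack.pop()
        else:
            stack.append((name, qubits, dagger))
    return stack

circuit = [("h", (0,), False), ("cx", (0, 1), False), ("t", (1,), False)]
folded = fold_global(circuit, lam=3)

print(len(folded))                                   # 15 = 3 gates * (2*3 - 1)
print(cancel_adjacent_inverses(folded) == circuit)   # True: ideal action unchanged
```

The cancellation check also shows why an optimizing compiler can undo naive folding, which is the motivation for the compiler-resistant variants discussed next.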

Quantum circuit unoptimization, a generalization of folding, builds a family $\{R_i(C)\}_{i=0}^{k}$ in which each $R_{i+1}$ is obtained by inserting a random two-qubit unitary $A$ and its inverse $A^\dagger$ (so that $A^\dagger A = I$) at various locations and then rewriting the resulting gate sequence.

This permits exponential scaling in the number of noise-amplified variants, determined by the number of eligible gate pairs and by the choice of $A$, which is sampled from the set of 4×4 unitaries. Circuit unoptimization thus yields exponentially many structurally distinct, compiler-resistant folded circuits, enabling robust noise averaging over variants and mitigating bias from spatially heterogeneous noise channels and adversarial compilation (Pelofske et al., 8 Mar 2025).

Each folded, noise-amplified circuit is run at a known scale factor and yields one observable estimate, giving a set of (scale factor, expectation value) data points for regression. Polynomial or Lagrange extrapolation of these points to the zero-noise limit recovers an estimate of the zero-noise target.
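The extrapolation step can be sketched with plain Lagrange interpolation evaluated at zero noise. The function name, the scale factors, and the quadratic decay model used to generate the synthetic data are all assumptions chosen for illustration; they are not taken from the cited work.

```python
def lagrange_extrapolate_to_zero(scale_factors, values):
    """Evaluate at 0 the unique polynomial through the given (scale, value) points."""
    if len(scale_factors) != len(values):
        raise ValueError("need one value per scale factor")
    if len(set(scale_factors)) != len(scale_factors):
        raise ValueError("scale factors must be distinct")
    estimate = 0.0
    for i, (xi, yi) in enumerate(zip(scale_factors, values)):
        weight = 1.0
        for j, xj in enumerate(scale_factors):
            if j != i:
                # Lagrange basis polynomial for point i, evaluated at x = 0.
                weight *= (0.0 - xj) / (xi - xj)
        estimate += yi * weight
    return estimate

# Synthetic data from an assumed model E(s) = 1.0 - 0.1*s + 0.005*s**2,
# where s is the gate-count scale factor of each folded circuit.
scale_factors = [1, 3, 5]
values = [1.0 - 0.1 * s + 0.005 * s * s for s in scale_factors]

zero_noise_estimate = lagrange_extrapolate_to_zero(scale_factors, values)
print(zero_noise_estimate)  # close to 1.0, the model's value at s = 0
```

With three points the interpolant is exactly quadratic, so it reproduces the assumed model; on real data a least-squares fit of lower degree than the number of points is usually less sensitive to shot noise.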

Key empirical results: on 10-qubit Quantum Volume circuits, unoptimization-based ZNE recovers the heavy-output probability nearly to the ideal. For 12-qubit QAOA, quadratic regression recovers the cost to within 1–2 standard errors of the ideal, substantially outperforming unmitigated runs (Pelofske et al., 8 Mar 2025).

2. Adaptive Folding and Filtering for Quantum Measurement Robustness

Adaptive circuit folding refines the above approach by selecting the noise-amplification factors $\lambda$ adaptively per run, based on the real-time circuit-level error strength. In practice, the largest factor $\lambda_{\max}$ is derived from the measured error strength using an empirically chosen constant, and a fixed number of $\lambda$ values are spaced exponentially between 1 and $\lambda_{\max}$. Folded circuits are generated using random or systematic allocation of identity insertions per two-qubit gate, with local or global folding schemes.
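As a rough sketch of the two mechanical steps described here, the following code spaces scale factors exponentially and allocates identity insertions systematically across gates. The names `lam_max`, `exponential_scale_factors`, and `allocate_insertions` are hypothetical, and the rule that one insertion turns $G$ into $G G^\dagger G$ (adding two gates) is an assumption about the local folding scheme; how the upper bound is derived from the measured error strength is not modelled.

```python
def exponential_scale_factors(lam_max, count):
    """Return `count` factors geometrically spaced from 1 to lam_max inclusive."""
    if lam_max < 1 or count < 2:
        raise ValueError("need lam_max >= 1 and at least two factors")
    ratio = lam_max ** (1.0 / (count - 1))
    return [ratio ** k for k in range(count)]

def allocate_insertions(num_gates, lam):
    """Systematic allocation: spread identity insertions evenly over the gates.

    Each insertion G -> G G^dagger G adds two gates, so reaching a gate-count
    scale of `lam` needs about (lam - 1) * num_gates / 2 insertions in total.
    """
    total = round((lam - 1) * num_gates / 2)
    base, extra = divmod(total, num_gates)
    return [base + (1 if i < extra else 0) for i in range(num_gates)]

def achieved_scale(num_gates, insertions):
    """Gate-count scale factor actually realised by an allocation."""
    return (num_gates + 2 * sum(insertions)) / num_gates

factors = exponential_scale_factors(lam_max=9.0, count=3)
plan = allocate_insertions(num_gates=4, lam=2.0)
print(factors)
print(plan, achieved_scale(4, plan))
```

Because insertions come in whole units, the realised scale can differ slightly from the requested one; using `achieved_scale` as the regression abscissa avoids that mismatch.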

In addition to adaptive scaling, filtering removes measurement outliers arising from transient noise: (a) error-strength filtering discards runs outside two standard deviations of the mode error, measured by an inverted-circuit test; (b) statistical filtering applies a two-component Gaussian mixture to expectation values at each $\lambda$, retaining only those in the primary component.
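Filter (a) can be sketched as follows, under stated assumptions: each run is a hypothetical `(error_strength, expectation_value)` pair, the mode is estimated crudely from a histogram with a user-chosen bin width, and the spread is the population standard deviation of all runs. The cited work may estimate these quantities differently, and the Gaussian-mixture filter (b) is not shown.

```python
import statistics
from collections import Counter

def estimate_mode(values, bin_width):
    """Crude mode estimate: centre of the most populated histogram bin."""
    bins = Counter(int(v // bin_width) for v in values)
    # Break ties in favour of the lowest bin so the result is deterministic.
    best_bin, _ = max(bins.items(), key=lambda item: (item[1], -item[0]))
    return (best_bin + 0.5) * bin_width

def filter_by_error_strength(runs, bin_width=0.01, n_sigma=2.0):
    """Keep runs whose error strength lies within n_sigma deviations of the mode."""
    errors = [err for err, _ in runs]
    mode = estimate_mode(errors, bin_width)
    sigma = statistics.pstdev(errors)
    return [(err, val) for err, val in runs if abs(err - mode) <= n_sigma * sigma]

# Five typical runs and one run hit by a transient noise burst (synthetic data).
runs = [(0.052, 0.71), (0.048, 0.73), (0.051, 0.70),
        (0.049, 0.72), (0.053, 0.69), (0.300, 0.21)]
kept = filter_by_error_strength(runs)
print(len(kept))  # 5: the burst run is discarded
```

Centering on the mode rather than the mean keeps the acceptance window anchored to typical device behaviour even when a few runs are badly off.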

Empirically, adaptive folding plus filtering yields strong gains: on standard IBM Q hardware, the RMSE of error-mitigated expectation values (Grover, HHL, H-Ladder circuits) falls by up to 29.8% compared with fixed-scale ZNE (Koenig et al., 7 May 2025).

3. Circuit Folding in Modular Quantum Workload Management

In quantum-classical workload management, circuit folding targets a different axis of resource reduction: it exploits repeated module structure within quantum circuits to minimize total quantum and classical workload during circuit knitting. The CiFold system constructs a meta-graph in which nodes are maximal repeated modules (gate sequences seen on multiple qubits) and edges encode temporal dependency or adjacency.

By folding, i.e. handling each unique module once rather than once per occurrence, quantum resource overhead (QRO) and classical sampling overhead for stitching are reduced, with savings that depend on the module frequency (the number of times a module recurs). Empirically, CiFold achieves up to 799.2% reduction in QRO on 190-qubit adders, as well as substantial sampling and fidelity improvements across algorithms (e.g., QFT, GHZ, Bernstein-Vazirani) (Kan et al., 2024).
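The counting argument behind module reuse can be illustrated with a toy example. This is not CiFold's meta-graph algorithm: it assumes, as a simplification, that a module is the entire gate-name sequence on one qubit, and the function names are hypothetical.

```python
from collections import Counter

def module_frequencies(per_qubit_sequences):
    """Map each distinct gate sequence to the number of qubits that carry it."""
    return Counter(tuple(seq) for seq in per_qubit_sequences)

def folded_workload(per_qubit_sequences):
    """Return (instances before folding, instances after folding, frequencies)."""
    freq = module_frequencies(per_qubit_sequences)
    total_instances = sum(freq.values())   # one per qubit without folding
    unique_modules = len(freq)             # one per distinct module with folding
    return total_instances, unique_modules, freq

# Three qubits share the same module; one qubit carries a different one.
sequences = [["h", "t", "h"], ["h", "t", "h"], ["h", "t", "h"], ["x", "s"]]
total, unique, freq = folded_workload(sequences)
print(total, unique)              # 4 2
print(freq[("h", "t", "h")])      # 3: this module's frequency
```

In this toy setting the workload drops from four module evaluations to two, and the saving for each module is its frequency minus one.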

4. Folding in Classical Hardware: VLSI and FPGA Architectures

Circuit folding and closely related folding transformations are also foundational in classical VLSI and FPGA design. The principle is to replace a set of spatially distributed, logically independent processing elements with a single physical resource re-used over time (time-multiplexing).

In VLSI architectures for bipartite dataflow graphs derived from projective geometry, folding systematically overlays a larger number of logical compute/memory units onto a smaller number of physical units, preserving topology through a fixed mapping (no runtime switch reconfiguration). Multi-tier pipelining (micro-architecture, bucket, and graph-level) recovers throughput lost to folding, yielding area savings at the cost of a throughput penalty for LDPC decoders (Sharma et al., 2011).
In FPGAs, folding transformations based on subcircuits, not just individual operators, achieve resource reductions of up to 70% in total logic and DSP block usage. This involves identifying isomorphic subgraphs in the dataflow graph, assigning time slots, and inserting multiplexers and controllers. The cost-benefit is governed by the folding factor. Challenges include subgraph isomorphism (NP-hard) and optimal core selection, currently approached with user-guided or greedy heuristics (Möller et al., 2015).
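The area-for-time trade in hardware folding can be mimicked in software with a dot product. In this sketch, which is an illustration and not a model of either cited architecture, `num_units` stands for the number of physical multipliers and each loop iteration stands for one clock cycle; both names are assumptions.

```python
def unfolded_dot(a, b):
    """Spatially parallel reference: one multiplier per element pair."""
    return sum(x * y for x, y in zip(a, b))

def folded_dot(a, b, num_units=1):
    """Time-multiplexed dot product: reuse `num_units` multipliers over cycles.

    Returns (result, cycles). Fewer units means less 'area' but more cycles.
    """
    if num_units < 1:
        raise ValueError("need at least one physical unit")
    acc = 0
    cycles = 0
    for start in range(0, len(a), num_units):
        # The slice bounds play the role of the multiplexer select signal:
        # they choose which logical operations use the shared units this cycle.
        for x, y in zip(a[start:start + num_units], b[start:start + num_units]):
            acc += x * y
        cycles += 1
    return acc, cycles

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
print(unfolded_dot(a, b))             # 70
print(folded_dot(a, b, num_units=1))  # (70, 4): folding factor 4
print(folded_dot(a, b, num_units=2))  # (70, 2): folding factor 2
```

The result is identical in every configuration; only the cycle count changes, which is the throughput penalty that pipelining is meant to recover.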

5. Folding and Phase Folding in Quantum Compiler Optimization

In quantum compilers, phase folding (a specialization of circuit folding) merges phase gates whose activation conditions coincide on all program paths. This reduces T-count in Clifford+$T$ circuits, the key cost metric for error-corrected devices. Formally, affine (linear algebraic) relations suffice for Clifford+T blocks, while non-linear (polynomial ideal) domains extend folding to non-Clifford gates and to quantum programs with general control flow, loops, and procedures.

The relational analysis perspective makes folding optimizations compositional over the structure of the program, and the sum-over-paths symbolic technique extracts rich constraints that enable further merges. Empirically, affine phase folding matches state-of-the-art (e.g., PyZX), and higher-order polynomial folding (with FastTODD) yields further T-count reduction (e.g., 28→20 for dirty ancilla Toffoli, 399→119 for 8-bit adder) (Amy et al., 2024).
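A heavily restricted toy version of phase folding shows the core idea. It handles only CNOT and diagonal phase gates, tracks each qubit's value as a parity (XOR) of the input bits, and merges phases applied on the same parity. The gate encoding and function names are assumptions, and this is not the relational analysis of the cited work, which also handles Hadamard gates, control flow, and polynomial relations.

```python
def phase_fold(num_qubits, gates):
    """Merge phase gates that act on the same parity of input bits.

    gates: ("cx", control, target) or ("phase", qubit, k), where the phase
    gate is diag(1, exp(i*k*pi/4)), so k=1 is T, k=2 is S, k=4 is Z.
    Returns {parity_bitmask: total k mod 8} with zero phases dropped.
    """
    parity = [1 << q for q in range(num_qubits)]  # qubit q starts as input bit x_q
    phases = {}
    for gate in gates:
        if gate[0] == "cx":
            _, control, target = gate
            parity[target] ^= parity[control]     # target becomes x_t XOR x_c
        elif gate[0] == "phase":
            _, qubit, k = gate
            mask = parity[qubit]
            phases[mask] = (phases.get(mask, 0) + k) % 8
        else:
            raise ValueError(f"unsupported gate: {gate[0]}")
    return {mask: k for mask, k in phases.items() if k != 0}

def t_count(phases):
    """Odd multiples of pi/4 cost one T gate; even multiples are Clifford."""
    return sum(1 for k in phases.values() if k % 2 == 1)

# Three T gates, two of which act on the same parity x_1.
gates = [("phase", 1, 1), ("cx", 0, 1), ("phase", 1, 1),
         ("cx", 0, 1), ("phase", 1, 1)]
folded = phase_fold(2, gates)
print(folded)           # {2: 2, 3: 1}: an S on x_1 and a T on x_0 XOR x_1
print(t_count(folded))  # 1, down from 3
```

Two of the three T gates see the same parity and combine into a single S gate, which is the kind of merge that lowers the T-count.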

6. Circuit Folding in Flexible and Kirigami Circuits

In physical fabrication, circuit folding describes the transformation of planar, copper-based substrates into three-dimensional, functional circuits via laser-engraved folds (kirigami). The Fibercuit process enables dual-layer and 3D flexible circuits by selectively cutting and forming copper–Kapton composites, with laser-induced temperature-gradient mechanisms precisely controlling fold loci and angles.

Mechanical and electrical performance is quantified by hinge geometry, per-pass laser recipe, and cycling durability. Example devices include kirigami cranes with functional electronics (eight hinges), flex cables, and electromagnetic coil-driven actuators. Under repeated fold cycling, the increase in trace resistance stays below a reported bound, with cracks emerging only beyond a higher threshold (Yan et al., 2022).

7. Limitations, Open Problems, and Comparative Perspective

Principal limitations are domain-dependent:

Quantum circuit folding for ZNE: depth/scheduling interference by backend compilers, non-commutativity with pulse-level stretching, and bias from local error structure. Averaging over many structurally distinct folded variants partially mitigates these artifacts (Pelofske et al., 8 Mar 2025).
Modular folding in workload management: reliance on exact sequence matching (limiting for variational/randomized circuits), with current work exploring fuzzy/parameterized folding (Kan et al., 2024).
Hardware folding: subcircuit selection is intractable for arbitrary graphs; full automation remains unresolved (Möller et al., 2015).
Kirigami fabrication: mechanical fatigue at extreme angles and variability in multi-material boards remain engineering challenges (Yan et al., 2022).

Across all domains, circuit folding provides a rigorous means for structural and temporal reuse or redundancy, enabling new trade-offs in noise robustness, quantum resource scaling, classical hardware area, or form-factor design. The technique admits continual refinement as analysis tools, quantum hardware, and fabrication technologies mature.