Automated Circuit Discovery (ACDC)
- Automated Circuit Discovery (ACDC) is a suite of computational methods that identifies task-relevant, sparse subnetworks within complex systems to improve interpretability and design efficiency.
- Techniques such as activation patching, mixed-precision optimization, and edge attribution patching balance faithfulness with computational efficiency by selectively pruning non-essential components.
- ACDC methodologies have been validated across diverse domains—including neural, analog, quantum, and photonic circuits—achieving robust, scalable, and verified performance.
Automated Circuit Discovery (ACDC) refers to a suite of computational methodologies aimed at identifying sparse, task-relevant subnetworks—“circuits”—within larger models or physical systems, such as neural networks, electronic hardware, or quantum devices. ACDC fulfills a dual role: it provides a mechanistic deconstruction of complex computations in AI and physical circuits, and it serves as an automation substrate for design, verification, and analysis tasks where exhaustive human search is infeasible due to scale or complexity. The field encompasses both formal algorithmic criteria and high-performance system implementations, with extensive benchmark validation across both software-based and hardware-based domains.
1. Foundations: Formal Setting and Objectives
ACDC is grounded in viewing the model or system under study as a directed acyclic computational graph $G = (V, E)$, where nodes represent computational units (neurons, attention heads, circuit components) and edges represent data or signal flow. The central goal is to identify subgraphs (circuits) $C \subseteq G$ such that the restricted function $f_C$ closely replicates the full behavior or output $f_G$ of the original graph for a given set of tasks or behaviors. Two main desiderata govern the discovery process:
- Faithfulness: The circuit’s outputs should closely match those of the original system, either on a discrete set of test cases or over a defined continuous input domain.
- Sparsity/Succinctness: The circuit should be minimal in some sense—number of nodes, edges, or components—facilitating interpretability, deployability, and cost-effectiveness.
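These two desiderata can be combined as a constrained minimization; the following formalization is one common convention, with notation assumed here rather than fixed by the source:

```latex
\min_{C \subseteq G} \; |C|
\quad \text{subject to} \quad
\mathbb{E}_{x \sim \mathcal{D}_{\text{task}}}
\left[ D\big(f_C(x),\, f_G(x)\big) \right] \le \tau,
```

where $f_G$ is the full model, $f_C$ the circuit-restricted model, $D$ a task-dependent divergence, and $\tau$ a faithfulness tolerance.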
This task becomes especially demanding for large neural networks and complex physical circuits, where interpretability, robustness, and computational tractability collide (Conmy et al., 2023, Wang et al., 27 Oct 2025, Hadad et al., 18 Feb 2026).
2. Algorithmic Paradigms in ACDC
2.1 Activation Patching and Recursive Pruning
The classical approach to ACDC in neural models traces to recursive activation patching. For each edge $e$, importance is measured by the causal impact of "patching" it with a value taken from a corrupted or alternative input:

$$\Delta(e) = D\big(f(x_{\text{clean}}),\; f_{e \leftarrow x_{\text{corr}}}(x_{\text{clean}})\big),$$

where $D$ is a task-dependent metric (e.g., a KL divergence on output logits) and $f_{e \leftarrow x_{\text{corr}}}$ denotes model execution with edge $e$ patched. Edges are recursively or greedily pruned based on their effect size, with a hyperparameter $\tau$ controlling the faithfulness-succinctness tradeoff (Conmy et al., 2023).
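The recursive pruning loop can be sketched on a toy two-head model; this is an illustrative sketch of the greedy patch-and-prune idea, not the reference implementation, and the model, edge names, and metric are all assumed:

```python
# Toy sketch of greedy activation patching: y = h1(x) + h2(x), where each
# edge into the output is tested by substituting its corrupted activation.

def run(x, patched=None):
    """Run the toy model; `patched` maps an edge name to a replacement value."""
    patched = patched or {}
    h1 = 2.0 * x                      # "head" 1 (task-relevant)
    h2 = 0.01 * x                     # "head" 2 (nearly irrelevant)
    e1 = patched.get("h1->out", h1)   # edge h1 -> output
    e2 = patched.get("h2->out", h2)   # edge h2 -> output
    return e1 + e2

def discover_circuit(x_clean, x_corr, tau):
    """Greedily drop edges whose patched effect on the output is below tau."""
    corr_acts = {"h1->out": 2.0 * x_corr, "h2->out": 0.01 * x_corr}
    kept, patched = [], {}
    for edge, corr_val in corr_acts.items():
        trial = dict(patched, **{edge: corr_val})
        effect = abs(run(x_clean) - run(x_clean, trial))
        if effect < tau:
            patched = trial           # prune: leave the edge patched out
        else:
            kept.append(edge)         # edge matters: keep it in the circuit
    return kept

print(discover_circuit(x_clean=1.0, x_corr=0.0, tau=0.1))  # -> ['h1->out']
```

Each candidate edge costs one extra forward pass here, which is exactly the expense that the mixed-precision and attribution-based methods below are designed to amortize.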
Limitations
- Scalability: Each patching operation typically requires an expensive forward pass, leading to $O(|E|)$ forward passes per circuit discovery.
- Approximation Sensitivity: Pruning may inadvertently remove "negative" or context-sensitive components unless thresholds are carefully tuned.
2.2 Mixed-Precision Optimization: PAHQ
To mitigate the prohibitive cost of patching, Per-Attention-Head Quantization (PAHQ) exploits a key alignment: only the edge under current test needs to be evaluated at high numerical precision, while all other edges/heads can be quantized to lower precision. During circuit discovery in transformers, this is realized by dynamically instantiating 32-bit weights for the head under investigation and keeping the remainder at 8-bit:

$$\text{prec}(h) = \begin{cases} \text{FP32} & \text{if } h = h_{\text{test}} \\ \text{INT8} & \text{otherwise.} \end{cases}$$
This yields up to 80% runtime reduction and 30% memory savings with negligible loss in faithfulness (AUC-ROC within 2% of full-precision) (Wang et al., 27 Oct 2025).
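A minimal sketch of this precision-switching idea, with a hand-rolled symmetric quantizer and invented head names (the paper's actual kernels and interfaces are not reproduced here):

```python
# Sketch of the PAHQ idea: store all head weights in int8, and materialize
# full-precision weights only for the head currently being patched.

def quantize(w, scale=0.1):
    """Symmetric int8 quantization of a list of floats."""
    return [max(-127, min(127, round(v / scale))) for v in w]

def dequantize(q, scale=0.1):
    return [v * scale for v in q]

heads_fp32 = {"h0": [0.123, -0.456], "h1": [0.789, 0.012]}
heads_int8 = {name: quantize(w) for name, w in heads_fp32.items()}

def weights_for_patching(head_under_test):
    """High precision only where the current patch is evaluated."""
    return {
        name: heads_fp32[name] if name == head_under_test
        else dequantize(heads_int8[name])
        for name in heads_fp32
    }

w = weights_for_patching("h0")
print(w["h0"])  # exact fp32 weights for the tested head
print(w["h1"])  # int8-roundtripped weights everywhere else
```

Because only one head is resident at full precision at a time, memory scales with the quantized model plus a single head, matching the reported savings in spirit.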
2.3 Contextual Decomposition and Analytic Attribution
Contextual Decomposition for Transformers (CD-T) analytically splits activations into "relevant" and "irrelevant" parts, $\beta$ and $\gamma$, propagating the decomposition through each module (residual sums, nonlinearities, self-attention, MLPs). The contribution of a source unit $s$ to a set of receivers $R$ is measured as

$$c(s \to R) = \big\lVert \beta_{s \to R} \big\rVert,$$

where $\beta_{s \to R}$ is the relevant component at the output of a module attributable to $s$. Recursive search and pruning yield interpretable circuits with higher faithfulness and a twofold speedup versus activation-patching baselines (Hsu et al., 2024).
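For linear modules, propagating such a decomposition is exact; the sketch below shows one common convention (bias assigned to the irrelevant part) on a toy layer, with all matrices and conventions assumed rather than taken from the CD-T paper:

```python
# Minimal sketch of contextual-decomposition propagation through a linear
# layer: an activation is carried as a (relevant, irrelevant) pair, and
# linearity guarantees beta + gamma always reconstructs the full activation.

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cd_linear(W, b, beta, gamma):
    """Propagate the split; the bias is assigned to the irrelevant part."""
    beta_out = matvec(W, beta)
    gamma_out = [g + bi for g, bi in zip(matvec(W, gamma), b)]
    return beta_out, gamma_out

W = [[1.0, 2.0], [0.0, 1.0]]
b = [0.5, -0.5]
beta, gamma = [1.0, 0.0], [0.0, 3.0]    # relevant / irrelevant input parts

beta_out, gamma_out = cd_linear(W, b, beta, gamma)
full_in = [x + y for x, y in zip(beta, gamma)]
full_out = [v + bi for v, bi in zip(matvec(W, full_in), b)]
# Exactness check: beta_out + gamma_out equals the undecomposed output.
print(beta_out, gamma_out, full_out)
```

Nonlinearities require an additional (approximate) splitting rule, which is where the analytic design choices of CD-T enter.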
2.4 Edge Attribution Patching
Edge Attribution Patching (EAP) replaces the per-edge patching forward pass with a linear Taylor expansion, requiring only two forward passes and one backward pass for all edges:

$$\Delta(e) \approx \big(a_e^{\text{corr}} - a_e^{\text{clean}}\big)^{\top} \frac{\partial L}{\partial a_e}\bigg|_{x_{\text{clean}}},$$

where $a_e$ is the activation carried by edge $e$ and $L$ is the task metric.
This enables much greater scalability, and empirical studies show improved or matched circuit recovery AUCs versus activation patching (Syed et al., 2023).
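The first-order approximation can be illustrated on a single scalar "edge"; the toy downstream function and its analytic gradient below are assumptions standing in for a real backward pass:

```python
# Sketch of edge attribution patching (EAP) on a toy scalar edge: the patch
# effect is approximated by the first-order Taylor term
# (a_corr - a_clean) * dL/da evaluated at the clean activation.

def downstream_loss(a):
    """Everything downstream of the edge, collapsed into one function."""
    return 3.0 * a + 0.1 * a * a

def grad_downstream(a):
    """Analytic gradient dL/da (in practice: one backward pass)."""
    return 3.0 + 0.2 * a

def eap_score(a_clean, a_corr):
    """Linear estimate of the patching effect: no extra forward pass needed."""
    return (a_corr - a_clean) * grad_downstream(a_clean)

a_clean, a_corr = 1.0, 0.0            # clean vs. corrupted edge activation
true_effect = downstream_loss(a_corr) - downstream_loss(a_clean)
print(eap_score(a_clean, a_corr), true_effect)   # estimate vs. exact effect
```

The estimate (-3.2) is close to the exact patched difference (-3.1) because the downstream function is nearly linear; the gap grows with curvature, which is the known failure mode of attribution patching.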
2.5 Formally-Verified Circuit Discovery
Recent work introduces ACDC with provable guarantees, leveraging neural network verifiers (e.g., α-CROWN) to obtain circuits with certified input robustness and patching-robustness over continuous domains:
- Input-Domain Robustness: For all inputs $x$ in a continuous ball $B_\epsilon(x_0)$, the circuit output matches the full network's output within some tolerance $\delta$.
- Patching Robustness: The circuit is resilient to patching interventions on non-circuit nodes, guaranteeing faithful behavior for all valid activations.
Multiple minimality frameworks—subset-minimal, cardinal-minimal—are supported, with blocking-set/hitting-set duality yielding near-optimal circuits (Hadad et al., 18 Feb 2026).
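The blocking-set/hitting-set duality can be made concrete on a toy instance; the component names and blocking sets below are invented for illustration, and the brute-force search stands in for the SAT/MaxSAT-style machinery used in practice:

```python
# Toy illustration of hitting-set duality for cardinal minimality: each
# blocking set is a group of components whose joint removal breaks
# faithfulness, so any faithful circuit must intersect ("hit") every
# blocking set; a minimum hitting set lower-bounds minimal circuit size.

from itertools import combinations

def min_hitting_set(universe, blocking_sets):
    """Smallest subset of `universe` intersecting every blocking set."""
    for k in range(len(universe) + 1):
        for cand in combinations(sorted(universe), k):
            if all(set(cand) & bs for bs in blocking_sets):
                return set(cand)
    return set(universe)

blocking = [{"h1", "h2"}, {"h2", "h3"}, {"h1", "h3"}]
print(min_hitting_set({"h1", "h2", "h3"}, blocking))  # a 2-element hitting set
```

No single head hits all three blocking sets here, so every faithful circuit must contain at least two of them, which is exactly the kind of lower bound the duality provides.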
3. Domain-Specific ACDC: Hardware, Analog, Quantum, Photonic
ACDC methodologies adapt across physical and engineered circuit domains:
3.1 Analog and Power Circuits
- AnalogFed: A federated, modular transformer architecture for analog circuit topology generation, with privacy-preserving distributed training and subgraph-mining tokenization. Centralized and federated models achieve >95% validity and >99% novelty (Li et al., 20 Jul 2025).
- PowerGenie: Integrates analytical equivalence checking (graph-based KVL/KCL, VCR extraction) and evolutionary GPT-based population optimization for reconfigurable power converters, efficiently navigating the design space and outperforming alignment baselines in syntax, functional validity, novelty, and figure-of-merit (Gao et al., 29 Jan 2026).
3.2 Quantum and Superconducting
- Multi-Objective Evolution (Quantum ACDC): Variable-length genome encodings, elitist NSGA sorting, and non-convex multi-objective optimization over gate error, size, and hardware cost produce optimal and novel quantum circuits. Applications include rediscovering textbook QFT, Grover’s search, and finding divide-and-conquer variants (Potoček et al., 2018).
- SCILLA (Superconducting Circuits): Graph-based encoding of lumped linear/nonlinear elements, Hamiltonian-based objective functions, parallel design-explore loops, and evolutionary refinement discover 4-local superconducting qubit couplers surpassing prior designs in both coupling and noise resilience (Menke et al., 2019).
- Photonic Graph State Circuits: FFT-based polynomial simulation, gradient optimization in SU(m) space, and a sparsification regularizer yield minimal heralded photonic graph state circuits with rational beamsplitter ratios; up to 7.5× higher success vs. fusion-based baselines (Hartnett et al., 22 Aug 2025).
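The non-dominated (Pareto) sorting at the core of the NSGA-style quantum search above can be sketched on invented candidates; the objective triples (gate error, size, hardware cost) and circuit names are assumptions for illustration:

```python
# Toy sketch of non-dominated sorting for multi-objective circuit search:
# each candidate is scored on (gate_error, size, hardware_cost), all minimized.

def dominates(a, b):
    """a dominates b if no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

circuits = {
    "qft_v1":  (1e-3, 24, 5.0),
    "qft_v2":  (5e-4, 30, 6.0),   # lower error, but larger and costlier
    "qft_bad": (2e-3, 30, 6.5),   # dominated by qft_v2 in every objective
}
front = pareto_front(list(circuits.values()))
print([name for name, score in circuits.items() if score in front])
```

Elitist NSGA-style selection repeatedly peels off such fronts and fills the next generation from the best ranks, so incomparable trade-offs (like `qft_v1` vs. `qft_v2`) survive together.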
4. Advances in Granularity, Position-Sensitivity, and Input Adaptation
Recent approaches extend ACDC to finer-grained and more data-aligned circuits.
- Hierarchical Attribution via Linear Computation Graphs: By re-implementing MLP and OV circuits as strictly linear graphs (via sparse autoencoders and “Transcoder” modules), exact end-to-end and local circuit attributions become tractable at a massive feature scale, exposing polysemantic head features and their context-specific activations (Ge et al., 2024).
- Position-Aware Edge Attribution and Dataset Schemas: Circuits are made sensitive to token position and task semantics through position-aware edge attribution patching (PEAP) and schema-based token grouping, vastly reducing circuit size while maintaining faithfulness, even under variable-length inputs (Haklay et al., 7 Feb 2025).
5. Quantitative Evaluation and Comparative Performance
ACDC methods are benchmarked on both standard mechanistic interpretability tasks in NLP (IOI, Greater-Than, Docstring) and on domain-specific analog, quantum, and photonic challenges.
| Method | Circuit Size | Faithfulness (AUC/Recovery) | Runtime/Memory | Domain |
|---|---|---|---|---|
| ACDC (orig., FP32) | 68/262 | 0.98/0.91/0.95 (ROC AUC) | 99 min, 6.23 GB | Transformers |
| PAHQ-ACDC | 68 | ~0.96/0.87/0.89 | 20 min, 4.24 GB | Transformers |
| CD-T | ~30 heads | 97% ROC AUC | ~2× faster than patching | Transformers |
| EAP | ~50–100 edges | 0.90 (Greater-Than AUC) | Orders faster than ACDC | Transformers |
| AnalogFed (FedAvg) | n/a | 95.0% valid, 99.6% novel | SOTA FoM, privacy-secure | Analog circuits |
| PowerGenie | n/a | +23% FoM over training best | 10M params, GPT-based | Power converters |
| Quantum Evo. | ~15–30 gates | perfect circuits, gate error E < 10⁻³ | ~1,000–10,000 generations | Quantum circuits |
6. Limitations, Complexity, and Theoretical Guarantees
- Scalability: Techniques such as mixed-precision inference, analytical decomposition, and linear approximation are essential for scaling to LLM and hardware circuit complexity (Wang et al., 27 Oct 2025, Syed et al., 2023).
- Provable Robustness: Sample-based ACDC can break under adversarial or continuous input perturbations; certified verifiers guarantee robustness but at high computational cost (Hadad et al., 18 Feb 2026).
- Minimality Notions: Subset-, local-, cardinal-minimality trade off computational tractability and guarantee strength. Combinatorial approaches (hitting set duality) provide practical lower bounds and, for small circuits, provable minimization (Hadad et al., 18 Feb 2026).
7. Future Directions and Broad Applicability
Open problems include scaling formal verification to large networks, developing richer data- and task-adaptive discovery (e.g., for multi-modal, attention-based architectures), and integrating human-intelligible summarization for discovered circuits. Methods such as federated learning (AnalogFed), evolutionary finetuning (PowerGenie), and deep analytic-simulation hybrids (photonics, quantum) illustrate domain generality for system-scale automated discovery (Li et al., 20 Jul 2025, Gao et al., 29 Jan 2026, Hartnett et al., 22 Aug 2025).
ACDC now encapsulates an expanding methodological toolbox, unifying aspects of mechanistic interpretability, high-performance design automation, and formal verification across AI and hardware. Rigorous benchmarks demonstrate that modern techniques preserve or surpass interpretability fidelity while rendering circuit analysis feasible at previously intractable scales.