Optimized Point Addition Circuits for Elliptic Curve Discrete Logarithms

Published 1 Jun 2026 in quant-ph | (2606.02235v1)

Abstract: Shor's algorithm represents the main threat of quantum computers to cryptography. In order to precisely understand its feasibility, many authors have worked towards reducing its costs, either at the logical level (assuming a fault-tolerant architecture), or at the physical level (taking into account the constraints of envisioned hardware). In particular, recent works by Chevignard et al. (CRYPTO 2024) and Gidney (arXiv 2025) used improved arithmetic to significantly reduce the qubit cost of factoring RSA public keys. Even more recently, Babbush et al. (arXiv 2026) improved the cost of computing elliptic curve discrete logarithms, with a reduction of a factor 2 to 3 in gate count and qubit count compared to a previous work by Litinski (arXiv 2023). Their result relies on optimized point addition circuits on elliptic curves over prime fields. However they did not reveal their logical quantum circuits, relying instead on a zero-knowledge proof. In this paper, we detail a quantum logical circuit architecture which gives similar results as Babbush et al., with a slightly higher number of qubits (around 1.5% increase) and a slightly smaller Toffoli gate count (between 6.5% and 10% reduction) for the curve secp256k1. We also give gate counts for a generic variant of the circuit, which is valid for any prime field.

Authors (1)

Summary

  • The paper introduces explicit quantum circuit designs for elliptic curve point addition that lower the Toffoli count of Shor’s algorithm while keeping the qubit count close to the best known results.
  • It employs a hybrid technique combining CDKM and Gidney arithmetic with a novel Euclidean algorithm compression to optimize modular operations.
  • Empirical results demonstrate up to a 10% reduction in Toffoli gates for secp256k1, setting new benchmarks for quantum cryptanalysis.

Optimized Quantum Circuits for Elliptic Curve Discrete Logarithms

Abstract and Context

The paper "Optimized Point Addition Circuits for Elliptic Curve Discrete Logarithms" (2606.02235) advances quantum circuit optimization for the elliptic curve discrete logarithm problem (ECDLP), specifically the resource estimation of Shor's algorithm in the context of quantum cryptanalysis. Building directly on recent results that reduced the quantum cost of factoring and discrete logarithms (e.g., Babbush et al., arXiv 2026), the work provides explicit, fully specified logical quantum circuits for elliptic curve point addition, both for generic field primes and for the secp256k1 curve. For secp256k1, the Toffoli count is slightly lower (6.5% to 10%) and the qubit count slightly higher (around 1.5%) than in Babbush et al.

Technical Contributions

Circuit Architecture and Optimization Principles

The core of the paper is the construction and empirical validation of reversible logic circuits for elliptic curve point addition in affine coordinates. The design is rooted in the standard windowed approach for group operation parallelization in Shor’s ECDLP quantum algorithm. As with prior works, the major circuit bottleneck is in-place modular multiplication, which is required multiple times in each point addition.
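To make the cost structure concrete, here is a minimal classical sketch of affine point addition on a curve of the form y^2 = x^3 + 7 (the equation secp256k1 uses). The toy modulus 17, the sample points, and the names `affine_add` and `on_curve` are illustrative assumptions, not taken from the paper. The sketch covers only the generic case of two points with distinct x-coordinates and ignores exceptional cases.

```python
def affine_add(P, Q, p):
    """Add two affine points with distinct x-coordinates over F_p."""
    (x1, y1), (x2, y2) = P, Q
    if (x1 - x2) % p == 0:
        raise ValueError("sketch only handles points with distinct x-coordinates")
    # One modular inversion and one multiplication give the slope.
    lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    # One modular squaring gives the new x-coordinate.
    x3 = (lam * lam - x1 - x2) % p
    # One more modular multiplication gives the new y-coordinate.
    y3 = (lam * (x1 - x3) - y1) % p
    return x3, y3


def on_curve(P, p, b=7):
    """Check y^2 = x^3 + b over F_p."""
    x, y = P
    return (y * y - (x ** 3 + b)) % p == 0


p = 17                    # toy prime, NOT a cryptographic size
P, Q = (1, 5), (2, 7)     # two points on y^2 = x^3 + 7 over F_17
R = affine_add(P, Q, p)
print(R, on_curve(R, p))
```

The division hidden in the slope is the step that has no cheap reversible counterpart, which is why the modular multiplication and inversion routines dominate the circuit cost discussed here.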

The construction distinguishes itself in several ways:

  • Explicit Circuit Disclosure: Unlike preceding works that validated claims with zero-knowledge proofs and omitted circuit-level descriptions (cf. Babbush et al.), the present work provides complete, reproducible circuit methodologies.
  • Resource Trade-off: By careful selection between arithmetic circuit families (CDKM, Gidney, and hybrids), optimizations are made for both gate and space (qubit) efficiency, exploiting available ancillas whenever possible.
  • Optimized Modular Arithmetic: For curves like secp256k1, the pseudo-Mersenne structure of the prime further simplifies modular reduction steps, particularly within modular doubling and modular addition.

Compression of Euclidean Algorithm State

A novel element is the efficient encoding of the step history ("dialog") of the binary extended Euclidean algorithm (EEA) during modular inversion, exploiting the observation that the computation can be split: first generating a "garbage" bit-vector detailing loop operations, then reconstructing Bézout coefficients in a secondary pass. A custom compression circuit maps 3 pairs of iteration bits into 5 bits, substantially reducing ancilla overhead per modular multiplication.

Figure 1: "Compression" circuit mapping 3 pairs $(b_0, b_0 \text{ content } b_1)$ into 5 bits; the last bit is always zero on valid input, optimizing qubit reuse.
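The 3-to-5 ratio can be motivated by a simple counting argument, sketched below in Python. The sketch assumes that each recorded pair has the form (b0, b0 AND b1), so that only three of the four bit patterns ever occur. This reading of the caption, the base-3 packing, and the function names are assumptions for illustration; they are not the paper's reversible circuit.

```python
from itertools import product

VALID_PAIRS = [(0, 0), (1, 0), (1, 1)]   # assumed: second bit is b0 AND b1


def pack3(pairs):
    """Pack three valid pairs (6 bits) into one integer below 27 < 2**5."""
    code = 0
    for pair in pairs:
        code = 3 * code + VALID_PAIRS.index(pair)
    return code


def unpack3(code):
    """Inverse of pack3: recover the three pairs from a 5-bit code."""
    pairs = []
    for _ in range(3):
        code, digit = divmod(code, 3)     # least significant digit = last pair
        pairs.append(VALID_PAIRS[digit])
    return tuple(reversed(pairs))


codes = set()
for triple in product(VALID_PAIRS, repeat=3):
    code = pack3(triple)
    assert code < 2 ** 5                  # always fits in 5 bits
    assert unpack3(code) == triple        # packing is lossless
    codes.add(code)
print(len(codes), "distinct codes")
```

Three such pairs take 27 combinations, which fit in 5 bits but not in 4, so 6 recorded bits can shrink to 5 and the sixth position becomes available for reuse.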

Performance Characteristics

Resource estimates for a single point addition, implemented and validated with the Qarton library, are as follows:

Circuit variant   Qubits   Toffolis (log2)   Curve type
Space-opt.        1192     21.19             secp256k1
Gate-opt.         1446     20.83             secp256k1
Space-opt.        1192     21.78             Generic prime
Gate-opt.         1446     21.42             Generic prime

For full Shor's algorithm instantiations on secp256k1, this results in 1208–1462 logical qubits and Toffoli counts in the $2^{25.8}$–$2^{26.1}$ range. Compared with previous work, that is a reduction of up to 10% in Toffolis at the cost of a slight (≤1.5%) increase in total qubits.
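As a rough consistency check, the full-algorithm figures can be approximated from the per-addition numbers with a back-of-envelope model. The model below is an assumption assembled from figures quoted elsewhere in this summary (28 windowed additions, three table lookups per addition, window size w = 16, and a nominal 2^w Toffolis per lookup); it is not a formula stated by the paper, and the extra w-qubit index register is a guess that happens to match the quoted totals.

```python
from math import log2

ADDITIONS = 28          # windowed point additions in one run
LOOKUPS_PER_ADD = 3     # table lookups per addition
W = 16                  # window size in bits


def full_run(log2_toffolis_per_add, qubits_per_add):
    """Estimate (log2 of total Toffolis, logical qubits) for the whole algorithm."""
    per_add = 2 ** log2_toffolis_per_add + LOOKUPS_PER_ADD * 2 ** W
    total_toffolis = ADDITIONS * per_add
    total_qubits = qubits_per_add + W     # assumed: one extra w-bit index register
    return log2(total_toffolis), total_qubits


for name, log2_t, qubits in [("space-opt.", 21.19, 1192), ("gate-opt.", 20.83, 1446)]:
    exponent, total_qubits = full_run(log2_t, qubits)
    print(f"{name}: about 2^{exponent:.2f} Toffolis, {total_qubits} qubits")
```

Under these assumptions the estimate gives 1208 and 1462 qubits and Toffoli exponents close to the quoted range, which suggests the point additions account for most of the cost and the lookups for the remainder.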

The success probability on random inputs is experimentally established, reaching at least $1 - 2^{-13.3}$ for relevant parameter choices.

Practical and Theoretical Implications

Post-Quantum Cryptanalysis

The immediate implication is a sharpened quantum resource estimate for attacking standardized elliptic curves, notably secp256k1 as used in Bitcoin and many other cryptocurrencies. The circuits' low Toffoli count and manageable qubit requirements provide precise targets for quantum hardware designers and policymakers assessing realistic post-quantum risk timelines.

Benchmark for Quantum Circuit Design

The explicit gate-level descriptions, modular structure, and the blend of theory with practical testing set a new reproducibility standard for cryptanalysis circuits. The modular arithmetic routine optimizations, particularly the use of hybrid Gidney/CDKM adders and specialized handling of pseudo-Mersenne reductions, are directly relevant for a wide range of quantum algorithms beyond ECDLP.

Impact on Algorithm Engineering

The work sharpens the trade-off frontier between qubits and gates, informing new circuits for modular inversion, multiplication, and addition. The compression of Euclidean operations history, in particular, may yield further efficiency gains in register-limited environments or inform new layouts in quantum memory management.

Forward-Looking Speculation

With explicit circuit designs published, further reductions can be sought:

  • Adaptation to Physical Constraints: Translating logical circuits into error-corrected qubit layouts, i.e., optimizing for architectures with topological constraints or measurement-based quantum computing.
  • Composable Circuit Libraries: The arithmetic and modular routines presented could accelerate the assembly of more complex quantum cryptanalysis methods, and may benefit post-quantum signature and encryption scheme analysis (e.g., quantum attacks on isogeny-based cryptosystems).
  • Parameter Tuning for Non-Standard Curves: With detailed resource scaling for generic primes, the framework is adaptable to custom curve parameters, informing cryptographic agility in the event of targeted quantum cryptanalysis.

Conclusion

The paper provides an explicit path from abstract resource estimates to reproducible, gate-level quantum circuits for elliptic curve DLP, establishing new benchmarks for both qubit and gate efficiency in the logical model. These contributions advance the field's understanding of the concrete cryptanalytic threat quantum computers pose to widely deployed elliptic curve protocols, and point to immediate follow-up research in optimized quantum arithmetic and cryptanalytic algorithm engineering.

Explain it Like I'm 14

What is this paper about?

This paper is about making a key part of Shor’s quantum algorithm faster and smaller when attacking elliptic‑curve cryptography (the kind built on curves such as secp256k1, which Bitcoin uses). The authors design and share the details of quantum “circuits” (step‑by‑step instructions for a quantum computer) that perform elliptic‑curve point addition more efficiently. This matters because point addition is the main building block for solving the elliptic‑curve discrete logarithm problem (the “hard problem” that keeps many cryptocurrencies secure).

What questions are the authors trying to answer?

In simple terms:

  • How can we design a point addition circuit for elliptic curves that uses fewer operations (“gates”) and not too many memory slots (“qubits”) on a quantum computer?
  • Can we match or beat recent improvements by other researchers while publishing the full circuit so others can reproduce and study it?
  • Can these tricks work not only for Bitcoin’s curve (which has a special prime) but also for any prime-based elliptic curve?

How do they approach the problem?

Think of the whole task like building a faster recipe for adding special kinds of points on a “curvy” number line (an elliptic curve). Shor’s algorithm needs to do this point addition many times, so shaving time and memory off the recipe matters a lot.

Here’s the plan in everyday language:

1) Focus on what really costs time

  • Shor’s algorithm boils down to repeating “windowed point additions,” where a small index picks a precomputed point and adds it to a running total.
  • The slowest parts are big‑number operations done “modulo” a prime (think of numbers that wrap around after a certain huge value, like a clock wraps from 12 back to 1).
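The windowing idea can be illustrated classically. The sketch below uses the multiplicative group modulo a small prime as a stand-in for the elliptic-curve group, so "adding a precomputed point" becomes "multiplying by a precomputed group element". The function name, the toy parameters, and the choice of group are assumptions made for illustration only.

```python
def windowed_power(g, k, p, n_bits, w):
    """Compute g^k mod p by consuming k in windows of w bits."""
    acc = 1                                   # running total (group identity)
    mask = (1 << w) - 1
    for j in range(0, n_bits, w):
        base = pow(g, 1 << j, p)              # g^(2^j), known classically
        # Classically precomputed table: entry i holds base^i.
        table = [pow(base, i, p) for i in range(1 << w)]
        index = (k >> j) & mask               # the small index for this window
        acc = acc * table[index] % p          # one group operation per window
    return acc


p, g, k = 65537, 3, 0xBEEF
print(windowed_power(g, k, p, n_bits=16, w=4) == pow(g, k, p))
```

With window size w, an n-bit scalar needs about n / w group operations, each preceded by a table lookup with 2^w entries, which is the trade-off behind choosing a window size such as w = 16.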

2) Make the hardest step (in‑place modular multiplication) cheaper

  • A key step is multiplying numbers “in place” (turn y into y×x modulo q, without extra output space). Traditionally, you’d first compute x⁻¹ modulo q (an “inverse”) and then combine it, which is expensive.
  • The authors use a clever two‑phase trick:
    • Phase A (Euclidean algorithm as a “recipe maker”): They run a version of the Euclidean algorithm (a classic way to compute greatest common divisors) on two inputs but, instead of carrying lots of extra numbers, they just record a compact “recipe” of simple steps (like “swap,” “subtract,” “halve”) as a bit string. You can think of this bit string as a compressed set of instructions needed later.
    • Phase B (Bézout reconstruction as a “recipe follower”): They replay those recorded steps on a different pair of numbers to directly transform y into y×x modulo q. This lets them do the multiply “in place” with fewer heavy operations.

Analogy: You watch a chef cook a dish and write down only the key steps (not the intermediate mess). Later, you follow those steps to recreate the result quickly without making all the same mess again.
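The two-phase idea can be sketched classically as follows. This is one possible realization, written under assumptions: the specific binary GCD variant, the (b0, swap) step encoding, the choice to replay the steps in reverse, and the names `record_dialog` and `replay_multiply` are all illustrative, and the paper's reversible circuits may differ in each of these respects.

```python
import random


def record_dialog(x, q):
    """Phase A: run a binary GCD on (x, q), keeping only two bits per step."""
    u, v = x, q                  # v stays odd throughout (q is an odd prime)
    dialog = []
    while u != 0:
        b0 = u & 1               # is u odd?
        swap = b0 & int(u < v)   # swap only when u is odd and smaller than v
        if swap:
            u, v = v, u
        if b0:
            u -= v               # both odd and u >= v, so u becomes even
        u >>= 1
        dialog.append((b0, swap))
    assert v == 1, "x must be invertible modulo q"
    return dialog


def replay_multiply(y, dialog, q):
    """Phase B: undo the recorded steps on (0, y) to obtain y * x mod q."""
    r, s = 0, y
    for b0, swap in reversed(dialog):
        r = (2 * r) % q          # inverse of the halving step
        if b0:
            r = (r + s) % q      # inverse of the subtraction step
        if swap:
            r, s = s, r
    assert s == 0                # the second register ends up cleared
    return r


q = 2 ** 256 - 2 ** 32 - 977     # secp256k1 field prime
rng = random.Random(0)
x, y = rng.randrange(1, q), rng.randrange(q)
dialog = record_dialog(x, q)
print(replay_multiply(y, dialog, q) == (y * x) % q, len(dialog))
```

The replay touches only modular doublings, additions, and swaps, and it never needs the intermediate values of the GCD computation, only the recorded bits. That is the property the chef analogy describes.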

3) Store the “recipe” efficiently

  • Each loop of the Euclidean algorithm writes just a couple of bits. The authors compress groups of these bits so the whole “recipe” fits into about 2.35n bits for an n‑bit prime (plus a small cushion), which is quite tight.
  • They also reuse cleared parts of registers to store these bits, keeping the total qubit count low.

4) Use approximate checks when exactness isn’t needed

  • Shor’s algorithm only needs the addition circuit to succeed with a high constant probability (not 100% every time). So the authors speed things up by:
    • Comparing only the top few bits (the “most significant bits”) instead of full n‑bit numbers in some places. This saves work but introduces a tiny error chance.
    • Taking advantage of the special shape of Bitcoin’s prime (a “pseudo‑Mersenne” number of the form $2^u - f$ where $f$ is small). With such primes, reducing numbers “mod q” can be replaced by much cheaper steps (like adding a small $f$ into only the low bits), which cuts many gates.

5) Build, test, and share

  • They implemented everything using a Python library (Qarton) and tested the circuits on many random inputs.
  • The circuits are not exact, but they measured a very low failure rate (about 1 in 10,000 or less), which is acceptable for Shor’s algorithm.

What did they find, and why is it important?

Big picture:

  • They match and in some ways improve the best publicly known logical (idealized) quantum circuits for elliptic‑curve discrete logs on secp256k1.
  • Compared to a recent (non‑disclosed) design by Babbush et al., they:
    • Use about 1.5% more qubits.
    • Use about 6.5% to 10% fewer Toffoli‑type gates (the most “expensive” kind of quantum operation to implement fault‑tolerantly).
  • Their full circuits are published and reproducible. That transparency is valuable for the community to verify, build on, and map to real hardware assumptions.

Concrete numbers (rounded for intuition):

  • Running Shor’s algorithm once to break one secp256k1 key would need on the order of 58–69 million Toffoli‑type gates with about 1,200–1,460 logical qubits, depending on the chosen space‑vs‑speed trade‑off.
  • This improves earlier public estimates (like a ~200 million Toffoli count from 2023 work) while using far fewer qubits than those older designs.
  • For curves over “general” primes (not Bitcoin’s special prime), their circuits still work but cost more gates, since they can’t use the same shortcut reductions.

Why it matters:

  • These are among the most detailed, efficient, and openly described circuits for attacking elliptic‑curve cryptography with Shor’s algorithm.
  • They sharpen our understanding of the true quantum resources required to threaten widely used systems (e.g., cryptocurrencies), informing realistic timelines and defenses.

What could this mean going forward?

  • Better risk estimates: With more precise gate and qubit counts, organizations can better judge when quantum attacks might become practical and plan migrations to post‑quantum cryptography.
  • Reproducibility and progress: Publishing the full circuit designs helps others validate, improve, and adapt them, speeding up research on both offense (to understand risks) and defense (to design safer systems).
  • Reusable techniques:
    • Splitting the Euclidean algorithm into a “recipe maker” and a “recipe follower” could help optimize other quantum arithmetic tasks.
    • Using approximate checks (top‑bit comparisons) and exploiting special prime shapes are general tricks that can reduce costs in many quantum cryptanalysis circuits.

In short, this paper gives a clear, open, and slightly improved path to building fast point‑addition circuits for Shor’s algorithm on elliptic curves—especially the one that secures Bitcoin—bringing us closer to accurate, end‑to‑end estimates of quantum attack costs.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a focused list of what remains missing, uncertain, or unexplored in the paper, phrased to guide concrete follow‐up research.

  • Formal failure analysis of approximate arithmetic:
    • Derive rigorous bounds on the failure probability of MSB‑only comparisons (for u>v and modular reductions) as a function of the chosen number of MSBs and padding parameters (the “≈40–50 MSBs,” α≈2.3, β≈2.4).
    • Provide tail bounds (not just normal-fit heuristics) for the Euclidean iteration count and garbage length to guarantee target success probabilities.
  • End-to-end success probability accounting:
    • Quantify how per-component failure probabilities compose across all subroutines and the full Shor run (including 28 windowed additions, three lookups per addition, and all modular ops), and determine repeat/run‑again overhead or the need for amplitude amplification.
  • Worst-case vs. average-case correctness:
    • Analyze and, if needed, mitigate failure modes under adversarial or worst-case inputs rather than only random inputs (e.g., pathological EEA paths or edge cases near modulus boundaries).
    • Specify detection/repair mechanisms or exact fallbacks for rare events beyond i=0 (e.g., accidental doublings, R+P_i producing exceptional cases) and quantify their overhead and probabilities.
  • Phase correctness and verification:
    • Provide a formal argument or tool-backed verification that measurement-based uncomputation and “kickmix” choices do not introduce harmful relative phases across superpositions relevant to Shor’s algorithm.
    • Move beyond functional (basis-state) testing to phase-aware equivalence checking or symbolic verification.
  • Simulation fidelity:
    • Replace or augment the current gate-level simulation proxy (which swaps arithmetic subcircuits for classical functions) with phase-accurate or gate-accurate verification to catch quantum-specific effects.
  • Circuit depth metrics:
    • Report Clifford depth, T-count, and especially T-depth (or Toffoli depth) and overall latency/measurement depth, not just total Toffoli counts, to enable realistic wall-clock projections under fault-tolerance.
  • Fault-tolerant mapping assumptions:
    • Analyze mid-circuit measurement latency, classical feed-forward requirements, and their impact on surface-code cycle counts.
    • Provide T‑factory sizing and schedule estimates given the Toffoli/AND gates and chosen decompositions (e.g., relative-phase Toffoli vs. standard), and compute corresponding T counts and T-depth.
  • QROM lookups and memory assumptions:
    • Detail physical costs and constraints of window table lookups ($2^w$ Toffolis nominally), including layout, routing, and error-corrected memory assumptions.
    • Explore alternatives to standard QROM (hashed select-and-add, streaming decompositions, compressed tables) and quantify their impact on qubits and depth.
  • Window-size optimization:
    • Re-optimize the window size w under the new point-addition costs (rather than adopting w=16), jointly minimizing total Toffolis, T-depth, and qubits for the full algorithm.
  • Generality across curves and primes:
    • Provide resource estimates and empirical validation for other primes and curves beyond secp256k1 (e.g., NIST P‑256, ed25519 prime), and characterize how special prime structures beyond pseudo‑Mersenne (e.g., generalized Mersenne, Solinas) alter the reductions.
    • Quantify how the “any-prime” circuit scales with n, and benchmark on multiple n to validate the asymptotic constants.
  • Coordinate-system trade-offs:
    • Systematically compare the presented affine/EEA-based approach with projective/Jacobian coordinates under similar approximate arithmetic to determine whether a different coordinate choice yields better qubit/gate/depth trade-offs.
  • Montgomery vs. standard representation:
    • Although this work avoids Montgomery form, run a controlled comparison (including approximate reductions) to identify regimes where Montgomery may still be superior in gate or depth, especially within the Bézout reconstruction loop.
  • Algorithmic accelerations:
    • Investigate combining these circuits with curve-specific accelerations such as GLV endomorphisms on secp256k1 or improved multi-scalar multiplication schedules (e.g., joint sparse form, NAF variants) to reduce the number of additions/windows.
  • Dialog/garbage encoding design space:
    • Analyze optimal block sizes and compressor designs beyond “3 pairs → 5 bits,” proving minimal Toffoli/ancilla requirements and quantifying the gate/space trade-offs for alternative encodings.
    • Provide proofs of correctness for the compressor on all valid inputs and characterize behavior/correctness under malformed sequences.
  • Robust handling of the x+y=q corner case:
    • The pseudo‑Mersenne controlled addition includes a special variant for x+y=q only during early iterations; evaluate whether a unified, low-overhead exact handler is possible without sacrificing success probability elsewhere.
  • Parameter selection methodology:
    • Replace ad‑hoc parameter choices (e.g., numbers of MSBs/LSBs used in reductions, padding constants) with an optimization framework that jointly targets resource counts and end-to-end success probability constraints.
  • Parallelism and scheduling:
    • Explore pipeline/overlap opportunities between “convert x to garbage,” Bézout reconstruction, and “convert back,” as well as parallelism among different windows, to reduce circuit depth.
  • Comprehensive resource–error trade-off curves:
    • Go beyond the two presented points (space‑optimized vs. gate‑optimized) to map out the full Pareto frontier, including ancilla budgets, compressor variants, and adder choices, to support architecture-specific co‑design.
  • Noise and logical error interaction:
    • Study how approximate arithmetic failures interact with stochastic logical errors under error-correction, and model the expected number of reruns or syndrome cycles needed to achieve a fixed success probability.
  • Detailed subcircuit counts:
    • Provide explicit Toffoli/T/Clifford counts per subcomponent (e.g., each modular add/double, each comparator) and their decomposition choices to enable independent reproduction and cross-checks.
  • Reproducibility and benchmarks:
    • Supply scripts, seeds, and instructions to reproduce the 10,000‑input success experiments and all reported counts; include tests across multiple curves and bit-sizes with raw logs for independent verification.
  • Physical-layout overheads:
    • Estimate swap/routing overheads for 2D nearest-neighbor layouts, especially for QROM and wide adders, and quantify how they alter depth and qubit counts under realistic connectivity constraints.
  • Modular squaring improvements:
    • The modular squaring accounts for ~9–10% of CCX cost; investigate whether specialized squaring circuits (exploiting structure in $x^2 \bmod q$) or Karatsuba/Toom-inspired techniques reduce this further under the same ancilla budget.
  • Beyond prime fields:
    • Assess extendability to binary fields or extension fields and identify which components (EEA dialog, modular reduction tricks) break or require redesign.

Practical Applications

Immediate Applications

The following applications can be deployed now for benchmarking, planning, education, and software/tooling, even though real-world quantum attacks remain dependent on future fault-tolerant hardware.

  • Resource estimation and benchmarking for quantum platforms (quantum hardware/software)
    • Use the open circuits and counts to calibrate compilers, schedulers, Toffoli factory sizing, and memory layouts for kickmix/measurement-based uncomputation circuits.
    • Benchmark suites: integrate the secp256k1 and “any prime” point addition/multiplication circuits into performance dashboards to compare backends and fault-tolerance stacks.
    • Assumptions/dependencies: logical (not physical) counts; Toffoli + AND as the non-Clifford metric; measurement-based uncomputation; success probability tuned via MSB windows.
  • Crypto risk quantification and migration planning (finance, cryptocurrency, cybersecurity, policy)
    • Update quantum risk models with concrete logical resources for ECC: e.g., ~1462 logical qubits and ~$2^{25.78}$ Toffolis to solve a single secp256k1 discrete log (gate-optimized variant), with 28 windowed additions and w=16.
    • Prioritize de-risking strategies: reduce address reuse, accelerate PQC migration plans (hybrid signatures, dual-certificates), and update board/counterparty risk dashboards.
    • Assumptions/dependencies: translation from logical to physical resources (error-correction overheads), throughput assumptions (surface code cycle times, factory rates).
  • Standards and policy support (policy, standards bodies, regulators)
    • Ground PQC migration timelines and crypto-agility mandates in reproducible, peer-reviewable resource estimates; provide evidence for freezing new ECC deployments in long-lived systems.
    • Inform updates to protocol profiles and procurement guidance (e.g., mandate hybrid PQC for new systems, sunset timelines for ECC-only stacks).
    • Assumptions/dependencies: policy uses do not depend on near-term fault tolerance; estimates rely on windowed Shor’s implementation and success probabilities reported.
  • Open, auditable reference circuits for academia and tooling (academia, software)
    • Adopt the Qarton implementation as a reproducible baseline for ECDLP circuits; enable independent verification and head-to-head comparisons with zero-knowledge–only claims.
    • Tooling: package in circuit libraries (e.g., as an “in-place modular multiplication via EEA dialog + Bézout reconstruction” primitive) for reuse across cryptanalysis pipelines.
    • Assumptions/dependencies: MSB-only comparisons and pseudo-Mersenne shortcuts are optional knobs; generic-prime variant provided.
  • Compiler and circuit-optimization research (academia, software)
    • Incorporate the split-EEA + Bézout reconstruction pattern to cut space in in-place modular operations; study success-probability–aware passes (e.g., MSB-only compares).
    • Evaluate hybrid adders (Gidney/CDKM) under ancilla scarcity; exploit dirty-ancilla constant adders where possible; expose “kickmix-friendly” templates to compilers.
    • Assumptions/dependencies: approximate arithmetic is acceptable when success probability remains high for random inputs.
  • Blockchain exposure analytics and audit services (cryptocurrency, cybersecurity)
    • Extend chain-scanning tools to quantify funds vulnerable to a quantum ECDLP attack (e.g., UTXOs with revealed pubkeys, reused addresses) and track exposure trends.
    • Produce “quantum risk heatmaps” per asset and address class; feed results into exchange/custodian policy and incident playbooks.
    • Assumptions/dependencies: purely classical analysis; uses resource estimates for risk scoring.
  • Education and workforce training (education, academia, industry upskilling)
    • Develop course modules and labs on quantum ECC attacks using the provided Qarton code and kickmix circuits; contrast exact vs approximate modular arithmetic.
    • Use these circuits to teach reproducibility and cost accounting (Toffolis/qubits) in quantum cryptanalysis.
    • Assumptions/dependencies: none beyond simulator availability.
  • Product planning for HSMs and wallets (security products, cryptocurrency)
    • Plan PQC algorithm support and hybrid signing workflows; add “quantum safety” posture indicators (e.g., alert on address reuse or revealed pubkeys).
    • Draft operational guidance: key rotation cadence, use of outputs that minimize early public key exposure, and staged migration to PQC-ready scripts as standards mature.
    • Assumptions/dependencies: PQC algorithms/hybrids must be standardized and interoperable; wallet UX changes are feasible without protocol forks.

Long-Term Applications

These rely on fault-tolerant quantum computers at scale or on broader ecosystem changes (standards, protocols, hardware co-design) and thus require further research, scaling, or development.

  • Practical quantum ECDLP attack pipelines (cryptocurrency, cybersecurity, law enforcement red teaming)
    • End-to-end workflow: target selection (exposed pubkeys), precomputation of windows, fault-tolerant scheduling of 28 windowed additions, magic-state/distillation planning, and real-time mempool attacks on pending transactions.
    • Services: authorized red-team assessments of ECC systems, with strict legal/ethical controls.
    • Assumptions/dependencies: millions-to-billions of physical qubits with low logical error rates; high-throughput Toffoli factories; reliable runtime and latency guarantees.
  • Protocol and ecosystem transition away from ECC-only (internet infrastructure, payments, identity)
    • Large-scale deprecation of ECC in TLS, SSH, cryptocurrencies, PKI, IoT, and hardware tokens; broad adoption of hybrid and then pure-PQC schemes.
    • Hardware refreshes: PQC-capable cards, HSMs, and secure enclaves; crypto-agile firmware and remote update infrastructure.
    • Assumptions/dependencies: mature PQC standards and implementations; performance acceptable for edge/embedded; regulatory mandates and migration funding.
  • Quantum compiler advances: success-probability–aware optimization passes (software, quantum compilers)
    • Automated selection of MSB windows, pseudo-Mersenne shortcuts, and hybrid adder choices under ancilla constraints to meet target success probability and cost budgets.
    • Verified lowering of kickmix templates and measurement-based uncomputation patterns.
    • Assumptions/dependencies: robust cost models for FTQC backends; integration with error-correction–aware schedulers.
  • Hardware–algorithm co-design for modular arithmetic (quantum hardware, EDA)
    • Architectures tuned to frequent constant additions, controlled swaps, and low-ancilla adders; layout/scheduling that minimizes distillation bottlenecks for Toffoli-heavy crypto circuits.
    • Potential specialized modules or instruction sets for modular doubling/addition against pseudo-Mersenne primes.
    • Assumptions/dependencies: stable hardware roadmaps; ability to influence ISA and microarchitecture.
  • Standardized, verifiable reporting of quantum cryptanalysis resources (policy, academia, industry consortia)
    • Repositories of open circuits plus zero-knowledge attestations of gate/qubit counts and success probabilities; common benchmarks for ECC curves (secp256k1, P-256, etc.).
    • Audit frameworks to compare logical-to-physical cost projections across architectures.
    • Assumptions/dependencies: community consensus on metrics and reporting; trusted tooling for circuit verification.
  • Generalization to other cryptanalytic targets (cryptography research)
    • Port the split-EEA/Bézout method and approximate modular arithmetic techniques to finite-field DL, pairings, class groups, and other number-theoretic primitives.
    • Explore windowed-addition optimizations for non–pseudo-Mersenne moduli with alternative reduction tricks.
    • Assumptions/dependencies: algebraic adaptations; renewed success-probability analyses per domain.
  • Financial risk transfer and cyber insurance products (finance, insurance)
    • Actuarial models for quantum-theft scenarios against ECC-protected assets; coverage products with triggers linked to QC capability thresholds and mitigation posture.
    • Assumptions/dependencies: credible industry consensus on attack timelines; regulatory acceptance.
  • End-user protections embedded in wallets and services (daily life, fintech)
    • Default safeguards that block risky behaviors (e.g., address reuse), automatic migration prompts to PQC/hybrid accounts, and “quantum readiness” scores surfaced to users.
    • Assumptions/dependencies: protocol support for PQC/hybrid scripts; broad wallet/vendor adoption; user education.

Notes on cross-cutting assumptions and dependencies:

  • The paper’s performance relies on windowed Shor’s algorithm, kickmix-compatible circuits, MSB-only comparisons (tunable), and special-case optimizations for pseudo-Mersenne primes (e.g., secp256k1). Generic-prime variants are provided with higher gate counts.
  • Reported failure probabilities (e.g., ≤ $2^{-13.3}$ per point addition configuration) are acceptable for Shor’s single-run success; full-system success depends on parameter tuning and error-corrected execution.
  • Translating logical resources to physical qubits and runtime depends on the chosen error-correction code, gate fidelities, cycle times, and distillation throughput; current projections place real attacks in the long-term category.

Glossary

  • AND gates: A logical operation used in quantum circuit constructions; often counted with Toffoli/CCX-style non-Clifford costs. Example: "The ``Toffolis'' column counts together CCX, CCZ as well as And gates."
  • affine coordinates: A way to represent elliptic-curve points as pairs (x, y) over a field rather than using projective forms. Example: "Points are represented using affine coordinates, i.e., a pair of integers modulo $q$."
  • ancilla qubits: Extra helper qubits used to facilitate reversible computations or reduce gate counts. Example: "depending on the number of ancilla qubits available."
  • Bézout coefficients: Integers r and s such that ru + sv equals the gcd(u, v); used here within the extended Euclidean process. Example: "the input part and the Bézout coefficients."
  • Bézout reconstruction: Recomputing the Bézout coefficients from a stored trace of Euclidean steps to effect inversion/multiplication. Example: "The second circuit is a Bézout reconstruction algorithm which takes as input the sequence of operations from the Euclidean algorithm"
  • binary Extended Euclidean Algorithm (EEA): A variant of the extended Euclidean algorithm using binary operations (shifts/subtractions) to compute inverses and coefficients. Example: "reversible variants of binary Extended Euclidean Algorithm (EEA)."
  • binary GCD algorithm: A shift-and-subtract algorithm for computing greatest common divisors; here related to reversible implementations. Example: "gives a reasonably efficient variant of binary GCD algorithm as used in previous implementations of elliptic curve point addition"
  • CDKM adder: A low-ancilla reversible adder (Cuccaro et al.) commonly used in quantum arithmetic. Example: "we simply switch between the CDKM adder~\cite{cuccaro2004new}, the Gidney adder~\cite{DBLP:journals/quantum/Gidney18} and a hybrid between the two"
  • Clifford+T gate set: A fault-tolerant gate set where T (non-Clifford) gates are costly; many metrics count T/Toffoli usage. Example: "are decomposed differently in the Clifford+T gate set."
  • controlled modular addition: Adding one modular operand to another conditioned on a control qubit. Example: "controlled modular addition, taken from~\cite{roetteler2017quantum}."
  • dirty ancillas: Ancilla qubits that may contain unknown garbage states but can still be safely reused in certain constructions. Example: "with dirty ancillas (which costs $3n$ Toffoli gates for $n$-bit integers)."
  • dialog (representation): A compact record of Euclidean steps (divides, swaps, adds) enabling later reconstruction; term borrowed from prior work. Example: "This sequence of operations is named a dialog in~\cite{khattar2025verifiable}"
  • elliptic curve discrete logarithms: The problem of finding k such that Q = [k]P on an elliptic curve; target of Shor’s algorithm here. Example: "improved the cost of computing elliptic curve discrete logarithms"
  • Euclidean algorithm: Iterative method to compute greatest common divisors; here used in a reversible, bit-tracked form. Example: "We start from a simple, but efficient, implementation of the Euclidean algorithm."
  • fault-tolerant quantum architectures: Quantum computing frameworks that can correct errors during computation, constraining circuit design. Example: "the error correction layer of fault-tolerant quantum architectures."
  • Gidney adder: A measurement-based uncomputation adder that trades ancillas for fewer Toffoli gates. Example: "the Gidney adder~\cite{DBLP:journals/quantum/Gidney18}"
  • Gidney’s constant adder: A low-cost quantum adder optimized for adding a classical constant, often with dirty ancillas. Example: "Gidney's constant adder~\cite{gidney2025classical}"
  • Hadamard (gate): A single-qubit gate creating superpositions; appears in measurement-based uncomputation patterns. Example: "Other gates include not (X), controlled not (CX), Hadamard, phase flips (CZ)."
  • Jacobi symbol algorithm: Number-theoretic procedure for symbol computation; used here as an analogy for space–iteration trade-offs. Example: "very similar to the Jacobi symbol algorithm in~\cite{cryptoeprint:2026/280}"
  • ket notation: The |·⟩ Dirac notation used to denote quantum states. Example: "and the ket notation \ket{\cdot} of quantum states."
  • kickmix circuits: A class of quantum circuits using classical operations, phases, and measurements that remain efficiently simulable. Example: "describe a class of ``kickmix'' circuits which operate using classical gates, phases and measurements."
  • logical quantum circuits: Algorithm-level circuits abstracted from physical error-correction details, optimized for qubits and non-Clifford gates. Example: "We focus on the optimization of logical quantum circuits"
  • measurement-based uncomputation: Technique where measurements and classical feedforward remove garbage instead of coherent uncomputation. Example: "allow measurement-based uncomputation and phases"
  • Montgomery representation: A representation for modular arithmetic that speeds multiplication; explicitly avoided here. Example: "we do not rely on the Montgomery representation of modular integers"
  • modular doubling: Computing 2x mod q in a reversible circuit, often optimized for special moduli. Example: "modular doubling and controlled modular addition"
  • modular squaring: Computing x^2 mod q; used in point addition formulas. Example: "a modular squaring, and two in-place multiplication circuits"
  • non-Clifford gates: Gates outside the Clifford group (e.g., T/Toffoli) that are expensive in fault-tolerant settings. Example: "These are the only non-Clifford gates in the circuits."
  • out-of-place multiplication: An operation that writes a product into a separate register rather than overwriting an input. Example: "out-of-place multiplication ($\ket{x,y,z} \mapsto \ket{x,y,z + xy}$)"
  • point at infinity: The identity element on an elliptic curve, denoted 𝒪 in group operations. Example: "or the point at infinity $\mathcal{O}$."
  • pseudo-Mersenne prime: A prime of the form 2^u − f with small f, enabling fast modular reduction. Example: "The prime in secp256k1 is a pseudo-Mersenne prime"
  • qubit recycling: Reusing qubits during semiclassical QFT to reduce space overhead. Example: "the semiclassical Fourier transform with qubit recycling"
  • secp256k1: A widely used Koblitz-like elliptic curve over a 256-bit prime field (used by Bitcoin). Example: "for the curve secp256k1."
  • semiclassical Fourier transform: A version of the QFT where measurements interleave the transform to reduce quantum resources. Example: "the semiclassical Fourier transform~\cite{griffiths1996semiclassical}"
  • Shor’s algorithm: Quantum algorithm that solves factoring and discrete log in polynomial time, threatening widely deployed public-key cryptography such as RSA and ECC. Example: "Shor's algorithm represents the main threat of quantum computers to cryptography."
  • table lookup circuit: A reversible circuit that loads precomputed constants (e.g., point coordinates) indexed by an input. Example: "each time, we load these coordinates using a table lookup circuit"
  • Toffoli gate: A three-qubit controlled-controlled-NOT (CCX) gate; key non-Clifford resource often used as a cost metric. Example: "In previous works, Toffoli gates or T gates have been used as the main metric."
  • windowed point addition: Adding a point selected from a precomputed window [P_0, …, P_{2^w − 1}] to a running sum based on an index. Example: "the windowed point addition circuit does: $\ket{i} \ket{R} \mapsto \ket{i} \ket{R + P_i}$"
  • zero-knowledge proof: A cryptographic proof revealing no information beyond the validity of a claim; here used to attest circuit properties. Example: "relying instead on a zero-knowledge proof."
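
Two of the glossary entries above (pseudo-Mersenne prime and Bézout coefficients) can be illustrated with a short classical sketch. This is not the paper's reversible quantum circuit; it only shows the underlying arithmetic on ordinary integers. The function names `reduce_pseudo_mersenne`, `bezout`, and `mod_inverse` are hypothetical, and the only external fact assumed is the standard secp256k1 field prime q = 2^256 − 2^32 − 977.

```python
# Classical illustration only: the paper implements these ideas as
# reversible quantum circuits, which this sketch does not model.

U = 256
F = 2**32 + 977          # secp256k1 field prime is q = 2^256 - F
Q = 2**U - F


def reduce_pseudo_mersenne(x):
    """Reduce a non-negative integer x modulo Q using 2^U = F (mod Q)."""
    mask = (1 << U) - 1
    while x >> U:                 # bits remain above position U
        hi, lo = x >> U, x & mask
        x = hi * F + lo           # fold the high part back in; x shrinks
    # Now x < 2^U = Q + F < 2Q, so one conditional subtraction suffices.
    return x - Q if x >= Q else x


def bezout(u, v):
    """Return (g, r, s) with r*u + s*v == g == gcd(u, v)."""
    r0, s0, r1, s1 = 1, 0, 0, 1   # invariants: r0*u0 + s0*v0 == u, r1*u0 + s1*v0 == v
    while v:
        k = u // v
        u, v = v, u - k * v
        r0, r1 = r1, r0 - k * r1
        s0, s1 = s1, s0 - k * s1
    return u, r0, s0


def mod_inverse(x):
    """Invert x modulo Q via the Bezout coefficient of x."""
    g, r, _ = bezout(x % Q, Q)
    if g != 1:
        raise ValueError("x is not invertible modulo Q")
    return r % Q


if __name__ == "__main__":
    a = 3**300
    print(reduce_pseudo_mersenne(a) == a % Q)
    print((mod_inverse(12345) * 12345) % Q == 1)
```

The folding step replaces a general division by a multiplication with the small constant f, which is the reason special-case circuits for pseudo-Mersenne moduli can be cheaper than generic-prime variants.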
