NWQWorkflow: Modular Quantum Computing Platform
- NWQWorkflow is an end-to-end modular stack for quantum computing that integrates software and hardware components for comprehensive QIS research.
- It features a layered toolchain utilizing standardized NWQASM, JSON device descriptions, and C++/Python bindings to enable efficient algorithm design, compilation, and simulation.
- Its closed-loop AI-driven benchmarking and feedback mechanisms refine error correction, pulse optimization, and hardware execution for scalable quantum applications.
NWQWorkflow is an end-to-end software–hardware stack designed for quantum application development, compilation, error correction, benchmarking, simulation, quantum control, and execution on a prototype superconducting testbed. Developed over eight years at Pacific Northwest National Laboratory (PNNL), it integrates multiple modular components to support closed-loop software–hardware co-design, facilitating the transition to scalable quantum supercomputing. The workflow combines a layered toolchain—from algorithm design and circuit compilation to device execution and feedback—enabling rigorous, iterative quantum information science (QIS) experimentation and development (Li, 21 Jan 2026).
1. Architecture and Data Flow
NWQWorkflow is organized as a multi-layer toolchain, with each layer interfacing through standardized intermediate representations and datasets.
At the top level is NWQStudio, a PyQt5-based integrated development environment (IDE). NWQStudio consolidates graphical quantum circuit editing, AI-assisted experiment planning, compiler invocation, simulator submission, benchmarking, and device execution within a unified user interface. The typical workflow consists of the following data flow:
- Algorithm Design: NWQStudio invokes NWQLib to generate quantum algorithm circuits, output as NWQASM files (an extended OpenQASM 2.0 format).
- Compilation: These NWQASM representations are processed by two possible compilers: QASMTrans for Noisy Intermediate-Scale Quantum (NISQ) devices or NWQEC for fault-tolerant quantum computing (FTQC). Both transpilers accept NWQASM plus device descriptions (JSON) as inputs.
- Benchmarking and Simulation: Resulting physical-gate circuits are analyzed by QASMBench, which computes various metrics (e.g., gate count, depth, entanglement variance, fidelity). Circuits are then simulated in silico with NWQSim, supporting several backends (state-vector, density-matrix, tensor network, stabilizer).
- Quantum Control: NWQControl translates gate-level circuits into I/Q pulse waveforms, targeting the QICK framework on Xilinx RFSoC hardware.
- Hardware Execution: The final compiled pulse programs execute on NWQSC, a prototype superconducting-qubit testbed.
AI-driven feedback is implemented at multiple stages: benchmarking results and noisy density-matrix simulations inform updates to device noise models (NWQData), which in turn tune compilation schedules and pulse shaping before the next hardware run. This closed-loop mechanism supports co-design at each layer of the stack.
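The feedback idea can be illustrated with a toy loop. The sketch below is not the NWQData update rule, which the source does not describe. It assumes a simple geometric fidelity model and a damped update, and every function name and number in it is a placeholder.

```python
def predicted_fidelity(error_per_gate, num_gates):
    """Toy model: independent gate errors, so fidelity decays geometrically."""
    return (1.0 - error_per_gate) ** num_gates


def refine_error_rate(current, measured_fidelity, num_gates, damping=0.5):
    """Move the modelled per-gate error toward the value implied by hardware."""
    implied = 1.0 - measured_fidelity ** (1.0 / num_gates)
    return current + damping * (implied - current)


error = 0.001      # initial guess stored in the (hypothetical) noise model
measured = 0.80    # placeholder benchmark result, not real device data
for _ in range(20):
    # Each pass stands in for one benchmark -> model update -> recompile cycle.
    error = refine_error_rate(error, measured, num_gates=100)

print(error, predicted_fidelity(error, 100))
```

After a few iterations the modelled error rate reproduces the measured fidelity, and that refined value is what a compiler or pulse optimizer would consume on the next run.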
All stack elements standardize interoperability through three primary formats:
- NWQASM ("*.qasm"): text-based, extended OpenQASM 2.0 representation.
- JSON device descriptions: contain hardware connectivity graphs, gate decompositions, operation durations, T₁/T₂, and crosstalk.
- C++/Python bindings: each component exposes Python layers for orchestration and scripting, while retaining HPC performance.
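As a rough illustration of the JSON device description, the sketch below builds one in Python and runs basic consistency checks. The field names and all numeric values are assumptions made for illustration; the actual NWQWorkflow schema is not given in the source.

```python
import json

# Hypothetical device description; field names and values are illustrative,
# not the actual NWQWorkflow schema or NWQSC measurements.
device = {
    "name": "example-4q",
    "num_qubits": 4,
    "coupling_map": [[0, 1], [1, 2], [2, 3]],
    "basis_gates": ["rz", "sx", "x", "cx"],
    "gate_durations_ns": {"rz": 0, "sx": 40, "x": 40, "cx": 300},
    "qubits": [{"id": q, "T1_us": 100.0, "T2_us": 80.0} for q in range(4)],
    "crosstalk": [{"pair": [[0, 1], [2, 3]], "strength": 0.01}],
}


def validate_device(desc):
    """Run basic consistency checks and return a list of problems found."""
    problems = []
    n = desc["num_qubits"]
    for a, b in desc["coupling_map"]:
        if not (0 <= a < n and 0 <= b < n) or a == b:
            problems.append(f"bad edge {a}-{b}")
    for gate in desc["basis_gates"]:
        if gate not in desc["gate_durations_ns"]:
            problems.append(f"no duration for {gate}")
    if len(desc["qubits"]) != n:
        problems.append("qubit table size mismatch")
    return problems


text = json.dumps(device, indent=2)   # what would be written to disk
restored = json.loads(text)           # what a compiler would read back
print(validate_device(restored))
```

Keeping the description as plain JSON means that any layer, whether written in C++ or Python, can read the same file without a shared in-memory object model.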
2. Core Components
NWQWorkflow's modular architecture is defined by specialized components integrated via API and data-format conventions.
NWQStudio (IDE)
- Python/PyQt5-based interface for circuit design, experiment planning, simulation, benchmarking, and execution.
- Incorporates an AI agent for adaptive experiment calibration and noise model refinement.
NWQASM (Intermediate Representation)
- 100% OpenQASM 2.0 BNF grammar compatibility with extensions for:
  - QRAM operations: e.g., `qload`, `qstore`.
  - Quantum network primitives: `qsend`, `qrecv`.
  - Explicit timing: `delay(q, cycles)`, global clock domain.
- Planned binary encoding (".nwasm") accommodates large-scale (>100 MB) circuits.
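To make the relationship between the base grammar and the extensions concrete, the following sketch separates standard OpenQASM 2.0 statements from extension statements by their leading mnemonic. The mnemonics come from the list above, but the operand syntax used in the sample program is an assumption, since the source does not specify it.

```python
import re

# Extension mnemonics named in the text; their operand syntax below is assumed.
EXTENSION_OPS = {"qload", "qstore", "qsend", "qrecv", "delay"}

# Hypothetical NWQASM snippet: the header, h, and cx lines are plain
# OpenQASM 2.0; the argument forms of the extension statements are invented.
SOURCE = """
OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
h q[0];
delay(q[0], 16);
cx q[0], q[1];
qsend q[1];
"""


def split_statements(source):
    """Return (standard, extension) statement lists, split on semicolons."""
    standard, extension = [], []
    for raw in source.split(";"):
        stmt = raw.strip()
        if not stmt:
            continue
        match = re.match(r"[A-Za-z_][A-Za-z0-9_]*", stmt)
        name = match.group(0) if match else ""
        (extension if name in EXTENSION_OPS else standard).append(stmt)
    return standard, extension


standard, extension = split_statements(SOURCE)
print(extension)
```

A tool that only understands OpenQASM 2.0 could use a filter of this kind to detect, and reject or strip, the statements it cannot handle.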
QASMTrans (NISQ Transpiler)
- Pipeline: gate-set decomposition, greedy lookahead qubit mapping, A*-based routing, and ASAP/ALAP scheduling.
- Outputs hardware-compatible NWQASM with physical indices and gate durations.
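The scheduling stage can be sketched in a few lines. The example below is a generic as-soon-as-possible (ASAP) scheduler, not QASMTrans code; the gate durations are placeholder values and the data layout is an assumption.

```python
def asap_schedule(gates, durations):
    """Assign each gate the earliest start at which all its qubits are free.

    gates: list of (name, qubits) in program order.
    durations: dict mapping gate name to duration (arbitrary time units).
    Returns a list of (name, qubits, start, end).
    """
    free_at = {}  # qubit -> time at which it becomes idle
    schedule = []
    for name, qubits in gates:
        start = max((free_at.get(q, 0) for q in qubits), default=0)
        end = start + durations[name]
        for q in qubits:
            free_at[q] = end
        schedule.append((name, qubits, start, end))
    return schedule


# Illustrative durations in nanoseconds (placeholders, not device data).
durations = {"sx": 40, "cx": 300}
gates = [("sx", (0,)), ("sx", (1,)), ("cx", (0, 1)), ("sx", (2,)), ("cx", (1, 2))]

for entry in asap_schedule(gates, durations):
    print(entry)
```

An ALAP scheduler follows the same idea in reverse: it walks the gate list backwards and places each gate as late as its successors allow, which shortens the idle time qubits spend before their first gate.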
NWQEC (Fault-Tolerant Compiler)
- Clifford+T path (TACO): grid-synthesis and T-optimization based on Ross–Selinger techniques.
- Pauli-based computation (TQC): adaptive Pauli measurement scheduling, tableau reduction, and a `Tfuse` pass for up to 30% T-count reductions.
- FTQC code descriptions (e.g., surface code) are foundational, with detailed theory in preparatory works.
QASMBench (Benchmark Suite)
- ∼60 OpenQASM circuits spanning chemistry, optimization, arithmetic, ML, and cryptography.
- Key metrics: GateCount, CircuitDepth, GateDensity, EntanglementVariance, MeasurementDensity, Fidelity, Quantum Volume.
- Full workflow: transpile → simulate (SV/DM/TN/STAB) → benchmark → hardware execution (optional).
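A few of the structural metrics can be computed directly from a gate list. The sketch below is not QASMBench code. It assumes circuits contain only one- and two-qubit gates plus measurements, counts measurements as a layer when computing depth, and uses an assumed gate-density definition (occupied qubit slots divided by depth times width).

```python
def circuit_metrics(gates, num_qubits):
    """Compute simple structural metrics for a gate list.

    gates: list of (name, qubits) in program order; only one- and two-qubit
    gates and single-qubit measurements are assumed.
    Depth is the longest chain of operations sharing qubits (unit-time layers).
    """
    level = [0] * num_qubits          # current depth reached on each qubit
    one_q = two_q = measures = 0
    for name, qubits in gates:
        layer = max(level[q] for q in qubits) + 1
        for q in qubits:
            level[q] = layer
        if name == "measure":
            measures += 1
        elif len(qubits) == 1:
            one_q += 1
        else:
            two_q += 1
    depth = max(level) if gates else 0
    # Assumed density definition: occupied qubit slots over available slots.
    density = (one_q + 2 * two_q) / (depth * num_qubits) if depth else 0.0
    return {
        "GateCount": one_q + two_q,
        "CircuitDepth": depth,
        "GateDensity": density,
        "Measurements": measures,
    }


# Three-qubit GHZ preparation followed by measurement of every qubit.
ghz = [("h", (0,)), ("cx", (0, 1)), ("cx", (1, 2)),
       ("measure", (0,)), ("measure", (1,)), ("measure", (2,))]
print(circuit_metrics(ghz, 3))
```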
NWQSim (HPC Simulation Suite)
- State-vector (SV-Sim): Multi-node MPI/NVSHMEM scaling (4,096 GPUs).
- Density-matrix (DM-Sim): Includes T₁/T₂, depolarization, and readout modeling; 21 qubits on 4,096 GPUs.
- Tensor network (TN-Sim): MPS with a TAMM/ITensor backend.
- Stabilizer (STAB-Sim): GPU-accelerated Clifford, 10–100× faster than prior CPU-based tools.
NWQControl (Quantum Control)
- Generates and optimizes I/Q waveforms for the QICK framework, with a built-in QuTiP solver for decoherence/crosstalk validation.
- Gate-sequence merging reduces pulse latency by ~20%.
- Future extensions: closed-loop calibration ("CaliQEC") for drift compensation.
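For readers unfamiliar with I/Q waveforms, the sketch below generates the in-phase and quadrature samples of a Gaussian-envelope pulse using only the standard library. It does not use the NWQControl or QICK APIs, and the function name, parameters, and pulse shape are assumptions chosen for illustration.

```python
import math


def gaussian_iq(num_samples, sigma, amplitude=1.0, phase=0.0):
    """Return (I, Q) sample lists for a Gaussian envelope rotated by `phase`.

    The envelope is centred on the middle sample; sigma is in samples.
    """
    centre = (num_samples - 1) / 2
    i_samples, q_samples = [], []
    for n in range(num_samples):
        envelope = amplitude * math.exp(-0.5 * ((n - centre) / sigma) ** 2)
        i_samples.append(envelope * math.cos(phase))
        q_samples.append(envelope * math.sin(phase))
    return i_samples, q_samples


# A 41-sample pulse with phase pi/2, so the envelope sits on the Q channel.
i_wave, q_wave = gaussian_iq(num_samples=41, sigma=8.0, phase=math.pi / 2)
print(max(q_wave), max(abs(v) for v in i_wave))
```

The phase argument selects the rotation axis in the rotating frame, while the envelope area sets the rotation angle, so a control layer can express different single-qubit gates by varying these two quantities.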
NWQSC (Superconducting Testbed)
- QICK-based RFSoC driving up to 32 fixed-frequency transmons (heavy-hex lattice).
- Specs: T₁ and T₂ (μs), single-qubit error, and CX error.
- Device data archived in NWQData for feedback-integrated operation.
3. Compilation, Simulation, and Benchmarking Methodologies
Compilation methodology in NWQWorkflow utilizes device-aware, resource-constrained mapping and scheduling:
- QASMTrans scales to large circuits on typical device topologies, compiling million-gate, thousand-qubit circuits in less than 60 seconds on a 16-core node.
- NWQEC provides both grid-synthesis (Clifford+T, TACO) and Pauli-based measurement sequencing (TQC), with tableau optimizations and T-count reductions proven on benchmark suites.
- QASMBench enables standardized, multi-metric characterization. Metrics such as fidelity and quantum volume are computed from simulation backends and compared to hardware when available.
Simulation exploits several model types:
- SV-Sim and DM-Sim scale to 42-qubit (state-vector) and 21-qubit (density-matrix) circuits, leveraging thousands of GPUs.
- TN-Sim (MPS with bond dimension χ) enables efficient simulation of low-entanglement problems.
- STAB-Sim accelerates Clifford circuits with large numbers of measurements over multi-GPU configurations.
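The pairing of 42 qubits for state-vector simulation with 21 qubits for density-matrix simulation follows from simple counting: an n-qubit density matrix has 4^n entries, the same as a 2n-qubit state vector. The sketch below checks this arithmetic. The 16-byte entry size assumes double-precision complex numbers, which the source does not specify, and real footprints depend on the precision and storage layout a simulator uses.

```python
def state_vector_amplitudes(n_qubits):
    """Number of complex amplitudes in an n-qubit pure state."""
    return 2 ** n_qubits


def density_matrix_entries(n_qubits):
    """Number of complex entries in an n-qubit density matrix (2^n x 2^n)."""
    return 4 ** n_qubits


def footprint_tib(entries, bytes_per_entry=16):
    """Storage in TiB; 16 bytes assumes double-precision complex numbers."""
    return entries * bytes_per_entry / 2 ** 40


sv = state_vector_amplitudes(42)
dm = density_matrix_entries(21)
print(sv == dm)            # both are 2**42 entries
print(footprint_tib(sv))   # 64.0 under the 16-byte assumption
```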
4. Hardware Integration and Quantum Control
The NWQSC superconducting testbed is a full-stack platform comprising:
- Control Electronics: QICK (Quantum Instrumentation Control Kit) on Xilinx RFSoC, supporting real-time I/Q waveform emission to drive up to 32-qubit arrays.
- Cryogenics: Dilution refrigerator achieves base temperatures near 10 mK.
- QPU Architecture: Heavy-hexagon lattice of fixed-frequency transmons; device topology, gate fidelities, and coherence statistics managed in NWQData for software–hardware synchronization.
- Pulse Programming: NWQControl leverages optimal control and sequence merging, incorporates QuTiP pulse validation, and plans for automatic calibration routines to mitigate drift and maintain operational fidelity.
Device configuration and characterization data continuously inform compiler and control layers through structured JSON datasets and are accessible for AI-driven calibration within NWQStudio.
5. Open-Source Ecosystem and Modularity
NWQWorkflow's impact derives from its open-source, modular ecosystem, which emphasizes:
- Permissive Licensing: Most components are Apache 2.0, MIT, or BSD-3 licensed, with the intent to maximize collaborative development. The implementation and release status of each component are summarized in the following overview:
| Component | Implementation | Status |
|---|---|---|
| NWQStudio | Python | Planned release |
| NWQASM | OpenQASM2 | In progress |
| QASMTrans | C++/Python | Released |
| NWQEC | C++/Python | Released |
| QASMBench | OpenQASM2 | Released |
| NWQSim | C++/Python | Released |
| NWQLib | Qiskit | In progress |
| NWQData | Text/JSON | In progress |
| NWQControl | C++/Python | Planned release |
| NWQSC | N/A (hardware) | Internal PNNL |
- Loose Coupling: Each major layer relies only on standardized IRs and APIs, enabling replacement (e.g., alternate transpilers, simulation backends, or control solutions) without disruption to the rest of the workflow.
- Minimal Dependencies: Externally, NWQStudio requires only PyQt5 beyond standard Python and C++ packages, limiting installation barriers and runtime overhead.
- Community Orientation: The stack is structured for multidisciplinary accessibility, supporting research efforts in compiler design, control, hardware, and quantum algorithm development with low entry barriers.
- Feedback-Driven Co-Design: The built-in closed-loop system and AI feedback integrate device benchmarking, simulation discrepancies, and calibration drift into an automated, iterative optimization cycle.
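The loose-coupling point above can be sketched with a structural interface. The `Transpiler` protocol, its `compile` signature, and both placeholder backends are hypothetical; they only illustrate how a layer that exchanges NWQASM text and a device dictionary could be swapped without touching its caller.

```python
from typing import Protocol


class Transpiler(Protocol):
    """Hypothetical interface: NWQASM text plus a device dict in, NWQASM out."""

    def compile(self, nwqasm: str, device: dict) -> str: ...


class IdentityTranspiler:
    """Placeholder backend that returns the circuit unchanged."""

    def compile(self, nwqasm: str, device: dict) -> str:
        return nwqasm


class CommentStampingTranspiler:
    """Placeholder backend that records the target device in a comment line."""

    def compile(self, nwqasm: str, device: dict) -> str:
        return f"// target: {device['name']}\n{nwqasm}"


def run_pipeline(transpiler: Transpiler, nwqasm: str, device: dict) -> str:
    """The caller depends only on the interface, not on a concrete backend."""
    return transpiler.compile(nwqasm, device)


circuit = "OPENQASM 2.0;\nqreg q[1];\nh q[0];\n"
device = {"name": "example-1q"}
for backend in (IdentityTranspiler(), CommentStampingTranspiler()):
    print(run_pipeline(backend, circuit, device).splitlines()[0])
```

Because the protocol is structural, a third-party transpiler needs no inheritance from a shared base class; matching the method signature is enough.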
6. Performance Envelope and Use Cases
Performance metrics underscore NWQWorkflow's readiness for large-scale QIS:
- QASMTrans compiles million-gate, thousand-qubit circuits in under a minute (16-core node).
- NWQSim simulates a 42-qubit circuit in 1–2 seconds per gate layer (1,024 A100 GPUs, ~5 TB RAM); density-matrix simulation for 21 qubits in under 10 seconds per layer (4,096 A100 GPUs).
- Tensor network simulations (χ=128) of chemistry circuits (100 gates) complete in under 30 seconds (64 CPU nodes).
- STAB-Sim demonstrates a speedup over Google Stim on measurement-heavy Clifford circuits (8 GPUs).
- Representative workflow: End-to-end variational quantum eigensolver (VQE) for benzene completes in about 3 hours on Perlmutter, including compilation, calibration, and measurement repetitions.
These benchmarks situate NWQWorkflow at the performance frontier for comprehensive quantum software–hardware workflows, emphasizing scalability and practical applicability for near-term and future fault-tolerant quantum computing research.
7. Significance and Prospects
By centralizing robust, multi-layer co-design, high-performance simulation, and modular software–hardware integration, NWQWorkflow provides a blueprint for scalable, community-driven QIS platforms. The architecture facilitates rapid prototyping and rigorous benchmarking of algorithms, compilers, and control strategies, supporting Department of Energy–scale, multi-tenant quantum facility deployments. Adoption of community-friendly licensing, standard schema definitions, and open distribution aims to accelerate collaboration and innovation in quantum computing research (Li, 21 Jan 2026). The extensible, feedback-integrated approach embodies contemporary best practices, supporting both immediate NISQ experimentation and the systematic exploration of fault-tolerant quantum architectures.