
Quantum-Centric Supercomputing Workflows

Updated 9 December 2025
  • Quantum-centric supercomputing workflows are hybrid systems that integrate quantum processors with classical HPC to enable reproducible and scalable computation.
  • They employ multi-stage pipelines that partition tasks into HPC pre-processing, quantum kernel submission, and classical post-processing for optimal efficiency.
  • These workflows utilize DAG-based orchestration, unified scheduling, and dynamic parallelism to mitigate quantum latency and enhance throughput.

Quantum-centric supercomputing (QCSC) workflows denote computational orchestration paradigms and architectures in which quantum processors (QPUs) and conventional high-performance computing (HPC) resources operate as tightly coupled, first-class computational entities. Rather than treating quantum hardware as an isolated novelty or wholesale replacement for classical supercomputing, QCSC positions QPUs as subsystems accessed as accelerators, analogous to GPUs or FPGAs, within systematically managed, reproducible, and scalable scientific workflows. This integrated design exploits complementary classical/quantum strengths, dispatching quantum-amenable kernels while leveraging decades of HPC advancements for classical task orchestration, parallelism, data management, error resilience, and scheduling (Bieberich et al., 2023).

1. Fundamental QCSC Workflow Patterns and Architectural Layers

QCSC workflows typically manifest as multi-stage pipelines, each stage apportioned to classical or quantum hardware according to algorithmic need, resource models, and synchronization constraints. The canonical QCSC architecture comprises three organizational domains (Bieberich et al., 2023):

  • HPC Pre-processing: Classical compute nodes or clusters generate scientific problem instances, perform data preparation (e.g., matrix assembly, motif extraction, molecular orbital construction), and serialize quantum circuit descriptions or kernels (commonly in OpenQASM or QIR).
  • Quantum Kernel Submission: QPU-bound tasks, often managed via workflow engines or asynchronous threadpools, submit serialized circuits to cloud or on-premises quantum backends, monitor execution (with error handling/retry in case of transient issues), and collect measurement results.
  • Post-processing: Final classical tasks decode quantum measurement output, apply problem-specific logic (e.g., decoding phase estimation results, analyzing amplitude statistics, inferring physical quantities), and aggregate results for downstream analysis.
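
The three-stage pattern above can be sketched with a minimal, self-contained pipeline. This is an illustrative stdlib-only sketch, not a vendor API: the quantum step is stubbed with a seeded pseudo-random sampler, and the circuit "description" is a toy dict; in a real deployment the serialized artifact (e.g., OpenQASM) would be submitted to a QPU service.

```python
"""Minimal sketch of the three-stage QCSC pipeline pattern.

The quantum backend is stubbed with a seeded pseudo-random sampler; in a
real deployment the serialized circuit description would be submitted to
a cloud or on-premises quantum backend instead.
"""
import json
import random
import tempfile
from pathlib import Path


def preprocess(problem_seed: int, workdir: Path) -> Path:
    """HPC pre-processing: build and serialize a circuit description."""
    circuit = {"seed": problem_seed, "n_qubits": 4, "ops": ["h 0", "cx 0 1"]}
    path = workdir / f"circuit_{problem_seed}.json"
    path.write_text(json.dumps(circuit))
    return path


def submit_quantum(path: Path, shots: int = 1024) -> dict:
    """Quantum kernel submission (stubbed): return measurement counts."""
    circuit = json.loads(path.read_text())
    rng = random.Random(circuit["seed"])
    counts: dict = {}
    for _ in range(shots):
        bits = format(rng.randrange(2 ** circuit["n_qubits"]), "04b")
        counts[bits] = counts.get(bits, 0) + 1
    return counts


def postprocess(counts: dict) -> str:
    """Classical post-processing: most frequent measurement outcome."""
    return max(counts, key=counts.get)


with tempfile.TemporaryDirectory() as d:
    circuit_file = preprocess(problem_seed=7, workdir=Path(d))
    counts = submit_quantum(circuit_file)
    answer = postprocess(counts)
    print(answer)  # prints the most frequent 4-bit outcome
```

Passing the circuit through an on-disk JSON file, rather than in memory, mirrors the file-based staging used in practice so that pre-processing, submission, and post-processing can run on different nodes.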

This partitioned model is mapped to a layered software stack featuring, from top to bottom: a workflow management library (e.g., Parsl, Prefect), resource management (SLURM with QPU-aware plugins or GRES tags), quantum programming APIs (native and Python/C++), and handler modules for each backend QPU or vendor’s interface (Bieberich et al., 2023, Shehata et al., 3 Mar 2025).

2. Task Orchestration and Workflow Management Approaches

Task orchestration in QCSC workflows leverages explicit directed acyclic graphs (DAGs) of tasks, structured via workflow engines such as Parsl or hybrid orchestration APIs (Bieberich et al., 2023, Shehata et al., 3 Mar 2025). Key design decisions include:

  • Executor Separation: Classical pre-/post-processing tasks are often assigned to dedicated HPC nodes through batch schedulers (e.g., via SlurmExecutor), while quantum job submissions are multiplexed using lightweight thread pools to maximize API parallelism and overlap quantum queue or execution delays.
  • File-Based Data Passing: Workflow data (input problem, circuit description, results) is typically serialized via intermediate JSON or Qobj files, staged on parallel filesystems accessible to all nodes—ensuring reproducibility and data locality.
  • Dynamic Parallelism: Multiple quantum circuits (e.g., eigenstate variants, problem permutations) can be submitted in parallel, bounded only by device quotas or user limits. Parallel execution mitigates quantum queue latency, especially when circuits are small or the number of shots modest.
  • Transparent Credential Management: Integration with SDK credential stores (e.g., Qiskit’s ~/.qiskit) avoids embedding security-sensitive tokens, facilitating compliant remote execution under shared tenant environments (Bieberich et al., 2023).

Workflow-level DAG construction ensures that task dependencies, data flow, and hardware assignment are statically governed yet amenable to dynamic scaling and retry.
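
The executor-separation and dynamic-parallelism points above can be illustrated with the standard library alone. The sketch below uses `concurrent.futures.ThreadPoolExecutor` to multiplex many circuit submissions, the same pattern workflow engines apply internally; `submit_to_backend` is a hypothetical stand-in for a blocking vendor API call, and `max_workers` stands in for a per-user device quota.

```python
"""Sketch of multiplexing quantum job submissions over a thread pool.

Thread pools suit this pattern because submission is I/O-bound (HTTP
calls plus remote queue waits), so many in-flight jobs can overlap.
`submit_to_backend` is a stand-in for a real vendor API call.
"""
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def submit_to_backend(circuit_id: int) -> dict:
    """Stand-in for a blocking quantum API call (submit + poll + fetch)."""
    time.sleep(random.uniform(0.01, 0.05))  # simulated queue/network delay
    return {"circuit_id": circuit_id, "counts": {"00": 510, "11": 514}}


# Submit many circuit variants concurrently, bounded by max_workers
# (standing in for a concurrent-job quota on the device).
circuits = range(8)
results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(submit_to_backend, c) for c in circuits]
    for fut in as_completed(futures):
        results.append(fut.result())

print(len(results))  # 8
```

Classical pre-/post-processing tasks would instead go to a batch-scheduler-backed executor; only the latency-bound quantum submissions belong on the thread pool.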

3. Resource Provisioning, Scheduling, and Queue Management

Efficient resource co-allocation and scheduling represent critical QCSC workflow competencies, addressing the non-trivial mismatch between the low-throughput, high-latency nature of quantum devices and the high-throughput parallelism of leadership-class HPC systems (Shehata et al., 3 Mar 2025, Bieberich et al., 2023). Representative features include:

  • Unified Resource Management: A single scheduler (e.g., SLURM with QPU-aware GRES plugins) oversees both classical and quantum allocations, supporting simultaneous or interleaved reservation of heterogeneous resources (Shehata et al., 3 Mar 2025).
  • Job Splitting and MPI Integration: Monolithic jobs that hold QPUs idle during classical compute phases are disaggregated into sequential sub-jobs tied to quantum access points. High-level workflow scripts (e.g., SBATCH scripts with per-stage dependencies) and MPI dynamic process management orchestrate quantum kernel submission without long-lived QPU reservation (Bieberich et al., 2023).
  • Error Handling: Quantum task submissions are wrapped with retry decorators, leveraging exponential back-off for transient failures in HTTP or cloud queues, to maximize eventual completion rates (Bieberich et al., 2023).
  • Throughput Maximization: Classical pre-/post-processing is overlapped with quantum invocation wherever possible. Because quantum device queue times can vastly exceed compute duration (e.g., 3–6 hours for a 2.5-minute quantum kernel), maximizing concurrency and pre-loading the quantum submission pipeline is essential for practical throughput.
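
The retry-with-exponential-back-off pattern described above can be written as a small decorator. This is a generic sketch of the pattern, not code from the cited frameworks; `flaky_submit` is a hypothetical submission that fails twice with a transient error before succeeding.

```python
"""Sketch of retry with exponential back-off for wrapping quantum job
submissions against transient HTTP or cloud-queue failures."""
import functools
import time


def retry(max_attempts: int = 4, base_delay: float = 0.01):
    """Retry a callable on exception, doubling the delay each attempt."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # exhausted: surface the real failure
                    time.sleep(delay)
                    delay *= 2  # exponential back-off
        return wrapper
    return decorator


attempts = {"n": 0}


@retry(max_attempts=4)
def flaky_submit():
    """Hypothetical submission that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient queue error")
    return "job-accepted"


print(flaky_submit())  # prints job-accepted after two retried failures
```

In production the `except` clause would match only transient error types (timeouts, HTTP 5xx), so that genuine programming errors fail fast instead of being retried.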

4. Algorithmic and Workflow Case Studies

QCSC workflows have been demonstrated for a diversity of quantum algorithms, with domain-specific workflow structures:

  • Grover’s Search: Parallel quantum circuits are instantiated for all oracle configurations and dispatched; post-processing statistically recovers the solution with >90% accuracy on real backends with a small quantum register (e.g., four qubits) (Bieberich et al., 2023).
  • Shor’s Algorithm: Classical preprocessing selects algorithmic parameters, then quantum circuits for period finding are synthesized, run, and analyzed, with heavy reliance on classical post-processing to infer nontrivial factors.
  • Quantum Phase Estimation for Optimization: Traveling Salesman Problem (TSP) instances are generated classically and encoded via quantum phase estimation routines. Quantum “phase” register measurement yields statistical evidence of shortest routes, and the algorithm is fully automated in a classical–quantum–classical DAG (Bieberich et al., 2023).
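
The post-processing step in the Grover example reduces to recovering the modal outcome from the measurement histogram, since the marked state should dominate after amplitude amplification. A minimal sketch, using an illustrative (not measured) 4-qubit histogram in which "1010" plays the role of the marked state:

```python
"""Sketch of statistical solution recovery for Grover's search: the
marked state dominates the measurement histogram, so the solution is
the most frequent bitstring. The counts below are illustrative."""


def recover_solution(counts: dict) -> tuple:
    """Return (bitstring, empirical probability) of the modal outcome."""
    shots = sum(counts.values())
    best = max(counts, key=counts.get)
    return best, counts[best] / shots


# Illustrative 1024-shot histogram: "1010" is the marked state; the
# remainder is noise spread over other basis states.
counts = {"1010": 942, "0000": 10, "0110": 24, "1111": 18, "0011": 30}
solution, prob = recover_solution(counts)
print(solution, round(prob, 3))  # 1010 0.92
```

The empirical probability here exceeds the >90% accuracy figure reported for small registers on real backends; noisier devices would spread more weight over unmarked states.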

These exemplars illustrate both the flexibility and reproducibility attainable with hybrid workflow encoding, and reinforce the need for precise dataflow, parametrization, and result demultiplexing in high-throughput QCSC pipelines.

5. Performance, Scalability, and Reliability Considerations

Empirical evidence from deployed QCSC workflows highlights the following operational factors (Bieberich et al., 2023):

  • Preprocessing Overhead: Generating problem instances and building/serializing quantum circuits is typically rapid (under 5–10 seconds per problem seed for moderate-sized problems).
  • Quantum Backend Latency: Real-device quantum submission is the dominant time sink due to queueing (3–6 hours for peak IBMQ, with actual gate execution on the order of minutes). Simulators offer much faster return (<2 minutes).
  • Orchestration Cost: Workflow management libraries introduce negligible overhead (<100 ms per task) for DAG construction and dispatch.
  • Error Handling: Workflows must robustly handle queue timeouts, intermittent device unavailability, and API failures. Structured retry logic significantly reduces spurious failures due to transient conditions.
  • Concurrency Limits: The scale of quantum parallelism is ultimately capped by device quotas (e.g., maximum concurrent jobs per user) and batch scheduler policies. Nonetheless, workflows scale effectively up to hundreds of simultaneous quantum kernels, with post-processing by classical tasks distributed over HPC resources (Bieberich et al., 2023).
  • Best Practices: Data locality (collocating scratch storage), workflow parallelism (one quantum task per eigenstate or problem permutation), tight error/certificate management, and decoupling execution credentials from job scripts are emphasized for robust, scalable adoption in leadership-class computing environments.
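
A back-of-envelope calculation makes the queue-latency point concrete. Using the figures quoted above (hours of queueing for minutes of gate execution), serial submission pays the queue delay once per kernel, while pipelined submission pays it roughly once in total. This toy model assumes all jobs can be enqueued up front and executions then serialize on one QPU, ignoring quota limits and result-fetch time.

```python
"""Back-of-envelope makespan comparison motivating concurrent
submission: queue latency dominates kernel execution, so pre-loading
the submission pipeline amortizes the wait across all kernels."""


def serial_makespan(n_jobs: int, queue_h: float, exec_h: float) -> float:
    """Each job queues and executes before the next one is submitted."""
    return n_jobs * (queue_h + exec_h)


def pipelined_makespan(n_jobs: int, queue_h: float, exec_h: float) -> float:
    """All jobs enqueued up front; executions serialize on one QPU."""
    return queue_h + n_jobs * exec_h


queue_h, exec_h = 4.0, 2.5 / 60  # 4 h queue wait, 2.5 min kernel
n = 100
print(serial_makespan(n, queue_h, exec_h))     # ~404 h
print(pipelined_makespan(n, queue_h, exec_h))  # ~8.2 h
```

Even this crude model shows a ~50x makespan reduction for 100 kernels, which is why concurrency limits (device quotas) rather than raw QPU time typically bound practical throughput.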

6. Implications, Challenges, and Future Directions

QCSC workflows represent a paradigm shift in scientific computing, embracing quantum processors as task-specific accelerators in reproducible, declarative pipelines rather than as holistic replacements for HPC. This model enables systematic experimentation, reproducibility, and transparent performance evaluation across hybrid resources (Bieberich et al., 2023). Key challenges ahead include:

  • Queue Management: Effective scheduling and resource sharing to address the high-latency, low-throughput nature of current quantum hardware.
  • Unified Programming Models: Seamless integration—abstracting away device specifics—remains an open objective, necessitating continued advancement in programming APIs, intermediate representations, and compiler/language support.
  • Scaling and Fault Tolerance: Ensuring robust recovery, checkpointing, and job retry at both classical and quantum stages is imperative for scientific reliability and resource efficiency.
  • Dynamic Dataflow: As quantum hardware matures and new device modalities emerge, QCSC workflow systems must adapt to dynamically refactor orchestration and task allocation strategies to maintain optimal system utilization and scientific throughput.

By deploying quantum-circuit tasks as first-class workflow entities, modern QCSC frameworks set the foundation for future large-scale, cross-disciplinary hybrid computation, in which quantum resources are invoked as needed and classical computation orchestrates the full data movement, error management, and results aggregation pipeline in an integrated supercomputing environment (Bieberich et al., 2023).
