Quantum-Centric Supercomputing (QCSC)
- Quantum-Centric Supercomputing (QCSC) is a hybrid paradigm that integrates quantum processors with classical HPC resources to address computationally intractable tasks.
- It decouples telemetry from execution via asynchronous, non-blocking data collection, adding negligible overhead to the workload while preserving a complete record for reproducibility.
- The architecture comprises data collectors, a telemetry bus, a persistence layer, and analysis modules, supporting infrastructure-aware design and performance tracking.
Quantum-Centric Supercomputing (QCSC) describes an architectural and algorithmic paradigm in which quantum processing units (QPUs) and classical high-performance computing (HPC) resources are tightly integrated, enabling hybrid workflows that exploit quantum sampling for classically intractable subroutines and classical computation for heavy numerical tasks. QCSC workflows involve probabilistic execution on remote or on-premise quantum hardware, large-scale classical postprocessing, and systematic telemetry and observability for reproducibility and infrastructure-aware experimental design (Kanazawa et al., 5 Dec 2025).
1. Foundational Architecture and Workflow Separation
The core QCSC architecture is structured around the separation between the computing platform—where hybrid classical–quantum workflows run—and the observability platform—responsible for persistent telemetry collection and analysis. The architecture comprises four logical tiers (Kanazawa et al., 5 Dec 2025):
- Data Collectors: Embedded workflow agents intercept system-level (PBS job accounting, Qiskit Runtime metrics), task-level (wall-clock time, CPU utilization), and domain-level (circuit parameters, bitstring samples) telemetry.
- Telemetry Bus: RESTful APIs and S3-compatible object stores persist JSON-encoded events and large binary artifacts with strong consistency.
- Persistence Layer: Relational databases (PostgreSQL) catalog time-series events, job records, and artifacts; key–value indices enable efficient retrospective queries and avoid data duplication.
- Analysis Modules: ETL workflows and dashboarding frameworks (Superset) deliver high-level metric aggregation and visualization of both infrastructure-centric and domain-specific performance.
All telemetry traffic is decoupled from execution—calls are non-blocking and asynchronous, ensuring no performance overhead on quantum or classical job completions.
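The paper does not publish its emitter code; the following is a minimal sketch of how such non-blocking, asynchronous emission can be structured, assuming a hypothetical REST endpoint on the telemetry bus (`TELEMETRY_ENDPOINT` and `emit_event` are illustrative names, not the paper's API):

```python
# Minimal sketch of a non-blocking telemetry emitter (illustrative only).
# The endpoint URL and event schema are assumptions, not the paper's API.
import json
import queue
import threading
import urllib.request

TELEMETRY_ENDPOINT = "http://observability.local/api/events"  # hypothetical

_events: queue.Queue = queue.Queue()

def emit_event(event: dict) -> None:
    """Enqueue a telemetry event; returns immediately (non-blocking)."""
    _events.put(event)

def _drain() -> None:
    """Background worker: POST events to the telemetry bus, off the hot path."""
    while True:
        event = _events.get()
        req = urllib.request.Request(
            TELEMETRY_ENDPOINT,
            data=json.dumps(event).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            urllib.request.urlopen(req, timeout=5)
        except OSError:
            pass  # telemetry failures must never break the workload

threading.Thread(target=_drain, daemon=True).start()
```

Workflow tasks call `emit_event` at invocation and completion; because the POST happens on a daemon thread, telemetry latency or failures never stall the quantum or classical workload.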
2. Workflow Execution and Telemetry Dataflow
The lifecycle of a QCSC experiment unfolds as:
- Workflow Submission: Prefect-driven flows mix QPU calls and HPC jobs; each task start time is recorded.
- Task Execution: QPU primitives invoke Qiskit Runtime; HPC jobs use PBS scripts and qsub, with job IDs pulled into the telemetry stream.
- Telemetry Emission: On task invocation and completion, JSON objects record start, end, exit code, and scheduler metrics (queue time, wall time, resources).
- Artifact Storage: L4 (domain-level) data, such as bitstring arrays, are compressed and uploaded to object storage; S3 pointers are emitted as event markers.
- ETL and Visualization: The observability platform consumes, transforms, and visualizes time-series metrics (energy convergence, queue latency, resource utilization) without interfering with computational execution; a sketch of such a flow follows this list.
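A minimal sketch of this lifecycle, assuming Prefect 2.x; the task names, PBS script path, and the `emit_event` stub (standing in for the non-blocking emitter above) are placeholders, not the paper's code:

```python
# Illustrative Prefect flow mixing a QPU call and a PBS job submission.
# Names and scripts are placeholders; this is a sketch, not the paper's code.
import subprocess
import time
import uuid

from prefect import flow, task

def emit_event(event: dict) -> None:
    """Stand-in for the non-blocking telemetry emitter sketched above."""

@task
def qpu_task(circuit_params: dict) -> str:
    task_id = str(uuid.uuid4())
    emit_event({"task_uuid": task_id, "event_type": "start", "ts": time.time()})
    # ... invoke a Qiskit Runtime primitive (e.g., a Sampler) here ...
    emit_event({"task_uuid": task_id, "event_type": "end", "ts": time.time(),
                "payload": {"exit_code": 0, "params": circuit_params}})
    return task_id

@task
def hpc_task(pbs_script: str) -> str:
    # qsub prints the PBS job ID; it is pulled into the telemetry stream
    job_id = subprocess.run(["qsub", pbs_script], capture_output=True,
                            text=True, check=True).stdout.strip()
    emit_event({"event_type": "submitted", "ts": time.time(),
                "payload": {"pbs_job_id": job_id}})
    return job_id

@flow
def qcsc_iteration(circuit_params: dict, pbs_script: str) -> None:
    qpu_task(circuit_params)
    hpc_task(pbs_script)
```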
This explicit dataflow enables persistent, reproducible tracking of algorithmic and infrastructure events. Telemetry is fully partitioned from the workload and is available for retrospective analysis.
3. Data Schemas, Performance Metrics, and Statistical Models
QCSC telemetry is structured by normalized relational schemas:
```sql
CREATE TABLE TaskEvent (
    event_id   SERIAL PRIMARY KEY,
    task_uuid  UUID NOT NULL,
    event_type TEXT NOT NULL,
    ts         TIMESTAMP NOT NULL,
    payload    JSONB NULL
);

CREATE TABLE JobRecord (
    job_id          TEXT PRIMARY KEY,
    platform        TEXT,
    submit_time     TIMESTAMP,
    start_time      TIMESTAMP,
    end_time        TIMESTAMP,
    allocated_quota JSONB,
    used_quota      JSONB
);

CREATE TABLE DomainArtifact (
    run_id        UUID,
    iteration     INT,
    population_id INT,
    name          TEXT,
    s3_key        TEXT
);
```
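With these schemas in place, retrospective queries reduce to ordinary SQL. A sketch of one such query from Python, assuming psycopg2 as the client library and a hypothetical connection string:

```python
# Illustrative retrospective query against the JobRecord catalog.
# psycopg2 and the DSN are assumptions about the client side, not the paper's stack.
import psycopg2

conn = psycopg2.connect("dbname=qcsc_telemetry")  # hypothetical DSN

with conn, conn.cursor() as cur:
    # Queue latency per platform: time spent waiting between submit and start.
    cur.execute("""
        SELECT platform,
               AVG(EXTRACT(EPOCH FROM (start_time - submit_time))) AS avg_queue_s,
               AVG(EXTRACT(EPOCH FROM (end_time - start_time)))   AS avg_wall_s
        FROM JobRecord
        GROUP BY platform
    """)
    for platform, queue_s, wall_s in cur.fetchall():
        print(f"{platform}: queue {queue_s:.1f}s, wall {wall_s:.1f}s")
```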
Major performance metrics:
- Latency: queue time $t_{\text{start}} - t_{\text{submit}}$ and wall time $t_{\text{end}} - t_{\text{start}}$, derived from JobRecord timestamps.
- Circuit Fidelity: estimated fidelity of the executed circuits (accounts for noise mitigation/post-selection).
- Trial Cost: $C = \alpha \cdot N_{\text{shots}} + \beta \cdot N_{\text{HPC tokens}}$, weighting QPU shot counts against HPC token usage (see the computation sketch after this list).
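A small sketch of how these metrics can be computed from JobRecord fields; the cost weights `alpha` and `beta` are illustrative assumptions, not values from the paper:

```python
# Illustrative metric computation from JobRecord-style fields.
from datetime import datetime

def latency_metrics(submit: datetime, start: datetime, end: datetime) -> dict:
    """Queue and wall time, decomposed from scheduler timestamps."""
    return {
        "queue_s": (start - submit).total_seconds(),
        "wall_s": (end - start).total_seconds(),
    }

def trial_cost(n_shots: int, hpc_tokens: float,
               alpha: float = 1e-4, beta: float = 1.0) -> float:
    # C = alpha * number_of_shots + beta * HPC_token_usage
    return alpha * n_shots + beta * hpc_tokens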
Domain-specific metrics in sample-based quantum diagonalization (SQD) workflows:
- Carryover acquisition: the number of newly acquired dominant determinants carried over into the next iteration's subspace.
- Parameter convergence: mean pairwise Euclidean distance between parameter vectors within a population.
- Hamming distance: distance of sampled bitstrings from the restricted Hartree-Fock (RHF) reference, tracking orbital occupation deviations.
- Sample preservation ratio: fraction of raw shots retained after filtering/post-selection (computed in the sketch below).
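The sketch below illustrates three of these metrics with NumPy, assuming bitstrings encoded as 0/1 arrays and a schematic RHF reference; it is not the paper's implementation:

```python
# Illustrative computation of SQD-style domain metrics with NumPy.
# Bitstring encoding and the RHF reference are schematic assumptions.
import numpy as np

def hamming_from_rhf(samples: np.ndarray, rhf_reference: np.ndarray) -> np.ndarray:
    """Per-sample Hamming distance from the RHF occupation bitstring."""
    return (samples != rhf_reference).sum(axis=1)

def sample_preservation(n_raw: int, n_kept: int) -> float:
    """Fraction of raw shots surviving filtering/post-selection."""
    return n_kept / n_raw

def parameter_convergence(params: np.ndarray) -> float:
    """Mean pairwise Euclidean distance between rows of a (population, dim) array."""
    diffs = params[:, None, :] - params[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    iu = np.triu_indices(len(params), k=1)
    return float(dists[iu].mean())
```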
Statistical analysis includes Bayesian binomial models for shot retention and rolling-window statistics for runtime drift. Outlier detection flags thresholded deviations, e.g., points lying more than a fixed multiple of the rolling standard deviation from the rolling mean.
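One concrete reading of the Bayesian binomial model is a conjugate Beta update on the retention probability; the Beta(1, 1) prior, the credible-interval level, and the shot counts below are illustrative assumptions:

```python
# Illustrative Beta-binomial posterior for the shot-retention rate.
# Prior, interval level, and the example counts are assumptions, not paper values.
from scipy import stats

def retention_posterior(n_kept: int, n_raw: int, a: float = 1.0, b: float = 1.0):
    """Posterior over the per-shot retention probability after observing
    n_kept retained shots out of n_raw (conjugate Beta update)."""
    posterior = stats.beta(a + n_kept, b + n_raw - n_kept)
    lo, hi = posterior.interval(0.95)  # 95% credible interval
    return posterior.mean(), (lo, hi)

mean, (lo, hi) = retention_posterior(n_kept=312, n_raw=4096)
print(f"retention ~ {mean:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```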
4. Persistent Monitoring, Reproducibility, and Redundant Run Elimination
All telemetry and artifacts are indexed by run, iteration, and fingerprint, supporting systematic reproducibility. Redundant runs are proactively suppressed by an existence check against the DomainArtifact table:
```python
def should_run(run_id, iteration, fingerprint):
    # `exists` is a lookup against the DomainArtifact table in the persistence layer.
    if exists(DomainArtifact, run_id=run_id, iteration=iteration,
              fingerprint=fingerprint):
        return False  # existing results - skip execution
    return True
```
If results are present, job execution is skipped and logs are reused. New metrics can be derived post-hoc by updating the ETL pipeline—re-execution is never required. This ensures that no parameter set is executed more than once, eliminating waste and guaranteeing experimental traceability.
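The fingerprint construction is not spelled out in this section; a deterministic hash over canonically serialized parameters is one plausible scheme (a sketch under that assumption):

```python
# Illustrative deterministic fingerprint over a task's parameter set.
# Canonical JSON + SHA-256 is an assumption; the paper does not fix a scheme here.
import hashlib
import json

def fingerprint(params: dict) -> str:
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Identical parameter sets always map to the same key, so should_run()
# can detect re-submissions regardless of dict ordering.
assert fingerprint({"a": 1, "b": 2}) == fingerprint({"b": 2, "a": 1})
```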
5. Representative Case Study: Sample-Based Quantum Diagonalization
In the closed-loop SQD workflow on a [Fe₄S₄(SCH₃)₄]²⁻ cluster:
- Experiment: 20 Differential Evolution (DE) iterations, 4 populations, yielding 80 QPU and 80 HPC jobs.
- Telemetry Findings:
- Ground-state energy convergence to –326.72 Ha, with residual nondeterminism of roughly 0.005 Ha.
- Carryover acquisition drops to zero after 10 iterations (dominant determinants stabilized).
- Parameter convergence does not shrink, indicating an aggressive search or persistent multimodality.
- The Hamming distance from the RHF reference remains nonzero, quantifying departure from RHF orbital occupations.
- The sample preservation ratio is low, reflecting strong hardware-induced noise.
- Orbital occupation plots show increasing LUMO occupancy, affirming physical relevance.
Without observability, early dropout or low shot retention would be detectable only through exhaustive manual inspection. The framework allowed transparent, quantitative analysis of algorithmic and hardware behavior.
6. Implications for Infrastructure-Aware Design, Transparency, and Experimentation
The decoupling of telemetry from execution in QCSC yields several benefits:
- Full transparency: All scheduler events, circuit parameters, and measurement samples are queryable across hardware/software boundaries.
- Infrastructure-aware design: Runtime variability (queue times, resource utilization) can be fed back to guide scheduling policy and platform co-design.
- Systematic experimentation: Run metadata are first-class, indexed objects—hyperparameter sweeps, reproducibility checks, and algorithmic benchmarking are automated and searchable.
- Reproducibility: Hidden or redundant computation is eliminated; analyses and metrics are always derivable from persistently stored results.
This reliable observability transforms "opaque, trial-and-error pipelines into fully instrumented, data-driven experiments" (Kanazawa et al., 5 Dec 2025), closing the loop between quantum hardware, classical supercomputing, and algorithmic insight. The measure–analyze–refine cycle becomes central to QCSC application development and platform deployment, facilitating the transition to scalable, reproducible quantum-centric computation.
Table: Key QCSC Observability Architecture Components
| Tier | Purpose | Technologies Used |
|---|---|---|
| Data Collectors | Telemetry acquisition (L2–L4) | Prefect, Qiskit Runtime, PBS |
| Telemetry Bus | Artifact/event ingestion/storage | REST API, S3-compatible object store |
| Persistence Layer | Durable storage, indexing | PostgreSQL, S3 |
| Analysis Modules | Metric computation, visualization | Prefect ETL, Superset |
The described observability framework for QCSC enables persistent, zero-overhead monitoring and analysis, reproducibility, and systematic algorithm/infrastructure co-design at scale (Kanazawa et al., 5 Dec 2025).