Low-Level and NUMA-Aware Optimization for High-Performance Quantum Simulation

Published 10 Jun 2025 in quant-ph and cs.AR | (2506.09198v1)

Abstract: Scalable classical simulation of quantum circuits is crucial for advancing both quantum algorithm development and hardware validation. In this work, we focus on performance enhancements through meticulous low-level tuning on a single-node system, thereby not only advancing the performance of classical quantum simulations but also laying the groundwork for scalable, heterogeneous implementations that may eventually bridge the gap toward noiseless quantum computing. Although similar efforts in low-level tuning have been reported in the literature, such implementations have not been released as open-source software, thereby impeding independent evaluation and further development. We introduce an open-source, high-performance extension to the QuEST simulator that brings state-of-the-art low-level and NUMA optimizations to modern computers. Our approach emphasizes locality-aware computation and incorporates hardware-specific optimizations such as NUMA-aware memory allocation, thread pinning, AVX-512 vectorization, aggressive loop unrolling, and explicit memory prefetching. Experiments show significant speedups: 5.5-6.5x for single-qubit gate operations, 4.5x for two-qubit gates, 4x for Random Quantum Circuits (RQC), and 1.8x for Quantum Fourier Transform (QFT). These results demonstrate that rigorous performance tuning can substantially extend the practical simulation capacity of classical quantum simulators on current hardware.
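
The abstract does not include code. As a rough illustration of why memory locality matters for the single-qubit gate operations it mentions, here is a minimal pure-Python sketch of a state-vector gate update. It is not the paper's implementation and not the QuEST API: the function name `apply_single_qubit_gate`, the list-of-complex state layout, and the convention that qubit 0 is the least significant index bit are all assumptions made for illustration.

```python
import math


def apply_single_qubit_gate(state, target, gate):
    """Apply a 2x2 gate to qubit `target` of a state vector, in place.

    Hypothetical helper for illustration only.
    `state` holds 2**n complex amplitudes; qubit 0 is the least
    significant bit of the amplitude index (an assumed convention).
    """
    stride = 1 << target   # index distance between the two paired amplitudes
    block = stride << 1    # each block holds `stride` independent pairs
    for base in range(0, len(state), block):
        for offset in range(stride):
            i0 = base + offset   # index with target bit = 0
            i1 = i0 + stride     # index with target bit = 1
            a0, a1 = state[i0], state[i1]
            state[i0] = gate[0][0] * a0 + gate[0][1] * a1
            state[i1] = gate[1][0] * a0 + gate[1][1] * a1
    return state


# Hadamard gate, used here only as a familiar test case.
s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

# Start in |000> and apply H to every qubit of a 3-qubit register.
state = [0j] * 8
state[0] = 1 + 0j
for q in range(3):
    apply_single_qubit_gate(state, q, H)
```

In this sketch the two amplitudes of each pair sit `2**target` entries apart, so gates on high-index qubits touch memory locations that are far from each other. That access pattern is plausibly where techniques named in the abstract, such as NUMA-aware allocation and explicit prefetching, would come into play in a tuned native implementation.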
