Scaling the Automated Discovery of Quantum Circuits via Reinforcement Learning with Gadgets

Published 14 Mar 2025 in quant-ph | (2503.11638v1)

Abstract: Reinforcement Learning (RL) has established itself as a powerful tool for designing quantum circuits, which are essential for processing quantum information. RL applications have typically focused on circuits of small to intermediate complexity, because computation times tend to increase exponentially with circuit complexity. This computational explosion severely limits the scalability of RL and casts significant doubt on its broader applicability. In this paper, we propose a principled approach, based on the systematic discovery and introduction of composite gates (gadgets), that enables RL to scale and thereby expands its potential applications. As a case study, we explore the discovery of Clifford encoders for Quantum Error Correction (QEC). We demonstrate that incorporating gadgets in the form of composite Clifford gates, in addition to standard CNOT and Hadamard gates, significantly enhances the efficiency of RL agents. Specifically, computation speed increases by one or even two orders of magnitude, enabling RL to discover highly complex quantum codes without prior knowledge. We illustrate this advance with examples of QEC code discovery with parameters $[[n,1,d]]$ for $d \leq 7$ and $[[n,k,6]]$ for $k \leq 7$. The most complicated circuits in these classes had not been found previously. We highlight the advantages and limitations of the gadget-based approach. Our method paves the way for scaling the RL-based automatic discovery of complicated quantum circuits for various tasks, which may include designing logical operations between logical qubits or discovering quantum algorithms.
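
To make the notion of a gadget concrete, the following minimal Python sketch treats a composite gate as a fixed sequence of CNOT and Hadamard gates that is applied as a single action. It is an illustration only, not the authors' implementation: the function names, the sign-free Pauli representation, and the two-gate `BELL_GADGET` are all assumptions chosen for brevity, and the gadgets in the paper are discovered systematically rather than written by hand.

```python
from typing import List, Tuple

# A Pauli string on n qubits stored as two bit lists (x, z); signs are ignored.
Pauli = Tuple[List[int], List[int]]
# A primitive gate: its name and the qubits it acts on.
Gate = Tuple[str, Tuple[int, ...]]


def apply_gate(p: Pauli, gate: Gate) -> None:
    """Conjugate the Pauli p in place by one primitive gate (H or CNOT)."""
    x, z = p
    name, qubits = gate
    if name == "H":
        (q,) = qubits
        x[q], z[q] = z[q], x[q]  # Hadamard exchanges X and Z
    elif name == "CNOT":
        c, t = qubits
        x[t] ^= x[c]  # X on the control propagates to the target
        z[c] ^= z[t]  # Z on the target propagates to the control
    else:
        raise ValueError(f"unknown gate {name!r}")


def apply_gadget(paulis: List[Pauli], gadget: List[Gate]) -> None:
    """Apply a composite gate (hypothetical 'gadget') as one action."""
    for gate in gadget:
        for p in paulis:
            apply_gate(p, gate)


def to_string(p: Pauli) -> str:
    """Render (x, z) bits as a Pauli string, e.g. 'XX'."""
    x, z = p
    return "".join("IXZY"[xi + 2 * zi] for xi, zi in zip(x, z))


# Hypothetical gadget: Hadamard on qubit 0 followed by CNOT(0 -> 1).
BELL_GADGET: List[Gate] = [("H", (0,)), ("CNOT", (0, 1))]

# Stabilizers of |00>: Z on qubit 0 and Z on qubit 1.
stabilizers: List[Pauli] = [([0, 0], [1, 0]), ([0, 0], [0, 1])]
apply_gadget(stabilizers, BELL_GADGET)
result = [to_string(p) for p in stabilizers]
print(result)  # ['XX', 'ZZ']
```

An agent whose action set includes such a gadget reaches the stabilizers `XX` and `ZZ` in one step instead of two. This is the intuition behind shortening the action sequences an agent must explore, although the sketch makes no claim about the speedups reported in the abstract.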