Circulant-STRING: Quantum String Processing
- Circulant-STRING is a quantum technique that employs circulant matrices to encode and generate quantum superpositions of cyclic shifts or suffixes for a given string.
- It leverages the Quantum Fourier Transform and controlled-phase gates to construct an O((log n)²)-depth circuit, a primitive that can speed up pattern matching and substring search once the string is loaded as a quantum state.
- Applications include genomic indexing, sequence alignment, and the Burrows–Wheeler Transform, offering a promising alternative to classical suffix data structures.
Circulant-STRING refers to the quantum implementation of string processing operations using circulant matrices as quantum operators. By leveraging the algebraic and spectral properties of circulant matrices, this technique enables the efficient generation and manipulation of all cyclic shifts (and, via end-markers, all suffixes) of a string when encoded as a quantum state. The central construction provides an O((log n)²)-depth quantum circuit for producing a uniform superposition over all string cyclic shifts, a fundamental primitive in quantum algorithms for string processing, substring search, and related problems.
1. Formal Definition: Circulant Matrices and Quantum String Encodings
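An $n \times n$ circulant matrix $C$ is defined by its first row $c = (c_0, c_1, \ldots, c_{n-1})$:

$$C = \begin{pmatrix} c_0 & c_1 & \cdots & c_{n-1} \\ c_{n-1} & c_0 & \cdots & c_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ c_1 & c_2 & \cdots & c_0 \end{pmatrix}$$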
Given a string $s = s_0 s_1 \cdots s_{n-1}$, encode it as the computational basis state $|s\rangle \equiv |s_0\rangle \otimes |s_1\rangle \otimes \cdots \otimes |s_{n-1}\rangle$. The action of $C$ on $s$, viewed as an $n$-component vector, generates all cyclic shifts of $s$ as rows. For a "coefficient" vector $c$ corresponding to a one-hot encoding of $s$ (or as a vector of character amplitudes), $C$ contains the information of all cyclic shifts in superposition, establishing the core primitive underlying Circulant-STRING (Daskin, 2022).
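As a purely classical illustration of this definition, the following sketch builds the circulant matrix whose first row holds the character codes of a string and reads the rows back as strings. The helper name `circulant_from_first_row` and the use of `ord` codes as entries are assumptions made for this example, not part of the cited construction.

```python
import numpy as np

def circulant_from_first_row(c):
    """Return the n x n circulant matrix with first row c.

    Row j is c cyclically shifted right by j positions,
    i.e. C[j, k] = c[(k - j) mod n].
    """
    c = np.asarray(c)
    n = len(c)
    return np.stack([np.roll(c, j) for j in range(n)])

s = "banana$"
codes = np.array([ord(ch) for ch in s])  # illustrative numeric encoding
C = circulant_from_first_row(codes)

# Decode every row back into a string: each row is one cyclic shift of s.
rows = ["".join(chr(v) for v in row) for row in C]
for j, row in enumerate(rows):
    print(j, row)
```

With this row convention the shifts appear as right rotations; the set of rows is the same as the set of left rotations `s[k:] + s[:k]`.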
2. Quantum Circuit Construction and Algorithmic Implementation
Circulant matrices are diagonal in the discrete Fourier basis, $C = F \Lambda F^{\dagger}$, where $F$ is the $n \times n$ discrete Fourier transform (DFT) matrix and $\Lambda$ contains the eigenvalues. On a quantum computer:
- Quantum Fourier Transform (QFT): $F$ and $F^{\dagger}$ are implemented as QFT and its inverse over $\log n$ qubits, each requiring depth and gate count $O((\log n)^2)$ using standard decompositions.
- Diagonal Operator $\Lambda$: Implemented via controlled-phase gates. For coefficient vectors $c$ that are known and sparse, this requires comparatively few gates; for arbitrary $c$, the gate count is larger.
- Permutation Operator Approach: Using the representation $C = \sum_{k=0}^{n-1} c_k P^k$, where $P$ is the cyclic permutation operator, construct a block-diagonal operator $U = \sum_{k=0}^{n-1} |k\rangle\langle k| \otimes P^k$, controlled by an ancilla register prepared in a superposition of the index states $|k\rangle$. This allows efficient realization of powers $P^k$ using single-qubit phase rotations.
A two-register quantum circuit is thus realized with gate count and depth $O((\log n)^2)$, dominated by QFT application. Summary of resources:
| Component | Gate Count | Depth |
|---|---|---|
| QFT + Inverse QFT | $O((\log n)^2)$ | $O((\log n)^2)$ |
| Controlled-phase network | at most $O((\log n)^2)$ | at most $O((\log n)^2)$ |
| Total | $O((\log n)^2)$ | $O((\log n)^2)$ |
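
The algebra behind this construction can be checked numerically. The sketch below is a classical NumPy verification, not a quantum circuit: it assumes the DFT sign convention $F_{jk} = \omega^{jk}/\sqrt{n}$ with $\omega = e^{2\pi i/n}$ and the first-row convention $C_{jk} = c_{(k-j) \bmod n}$. Other conventions swap the roles of $F$ and $F^{\dagger}$. All variable names are illustrative.

```python
import numpy as np

n = 8
rng = np.random.default_rng(0)
c = rng.normal(size=n)  # arbitrary coefficient vector (assumption for the demo)

# Circulant matrix with first row c: C[j, k] = c[(k - j) mod n].
C = np.stack([np.roll(c, j) for j in range(n)])

# Unitary DFT matrix under the assumed sign convention.
omega = np.exp(2j * np.pi / n)
idx = np.arange(n)
F = omega ** np.outer(idx, idx) / np.sqrt(n)

# Eigenvalues lambda_m = sum_l c_l * omega^(m l), placed on the diagonal.
lam = np.sqrt(n) * (F @ c)
diag_ok = np.allclose(C, F @ np.diag(lam) @ F.conj().T)

# Cyclic permutation operator: (P v)[j] = v[(j + 1) mod n].
P = np.roll(np.eye(n), 1, axis=1)

# C as a polynomial in P: C = sum_k c_k P^k.
sum_ok = np.allclose(
    C, sum(c[k] * np.linalg.matrix_power(P, k) for k in range(n))
)

# In the Fourier basis, P^k is a pure diagonal phase: P^k = F diag(omega^(m k)) F^dagger.
phase_ok = all(
    np.allclose(
        np.linalg.matrix_power(P, k),
        F @ np.diag(omega ** (idx * k)) @ F.conj().T,
    )
    for k in range(n)
)

print(diag_ok, sum_ok, phase_ok)
```

The last check is the reason powers of the permutation operator reduce to phase rotations between a QFT and its inverse.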
3. Generating Superpositions of String Suffixes
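Preparation of the quantum superposition of suffixes employs two quantum registers: an ancilla (initialized to $|0\rangle$) and the main register holding $|s\rangle$. Applying a Hadamard transform ($H^{\otimes \log n}$) to the ancilla yields a uniform superposition $\frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} |k\rangle$. The operator $U = \sum_{k=0}^{n-1} |k\rangle\langle k| \otimes P^k$ then maps $|k\rangle|s\rangle \mapsto |k\rangle P^k|s\rangle$, resulting in
$$\frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} |k\rangle\, |s^{(k)}\rangle,$$
where $s^{(k)}$ is the $k$th cyclic shift of $s$. With a special end-marker symbol (such as '\$') appended to the string, e.g. $s = \text{“banana\$”}$ (with $n = 7$), the circuit generates a uniform superposition over all rotated suffixes (Daskin, 2022):
$$\frac{1}{\sqrt{7}} \Big( |0\rangle|\text{banana\$}\rangle + |1\rangle|\text{anana\$b}\rangle + \cdots + |6\rangle|\text{\$banana}\rangle \Big)$$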
This transformation enables quantum parallelism over all suffixes.
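To make the resulting state concrete, here is a minimal classical bookkeeping sketch that lists the basis states and amplitudes of the superposition. It does not simulate qubits. It assumes a uniform ancilla state over $n$ indices, which a Hadamard transform provides exactly only when $n$ is a power of two, and it takes the $k$th cyclic shift to be the left rotation `s[k:] + s[:k]`. The function names are hypothetical.

```python
import math

def cyclic_shift(s, k):
    """Left-rotate s by k positions: the suffix s[k:] followed by the prefix s[:k]."""
    return s[k:] + s[:k]

def shift_superposition(s):
    """Map each basis state (k, k-th shift of s) to its amplitude 1/sqrt(n)."""
    n = len(s)
    amp = 1.0 / math.sqrt(n)
    # Ancilla index k controls how far the main register is rotated.
    return {(k, cyclic_shift(s, k)): amp for k in range(n)}

state = shift_superposition("banana$")
for (k, shifted), amp in state.items():
    print(f"|{k}>|{shifted}>  amplitude {amp:.4f}")

# Measurement probabilities are the squared amplitudes and sum to 1.
total_probability = sum(a * a for a in state.values())
```

Because the end-marker makes every rotation distinct, each of the $n$ basis states carries probability $1/n$, and the part of each rotation before the marker is a suffix of the original string.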
4. Comparison with Classical Suffix Data Structures
Classical suffix arrays or trees are constructed in $O(n)$ or $O(n \log n)$ time (e.g., by Ukkonen's algorithm), requiring $O(n)$ memory. By contrast, the Circulant-STRING quantum circuit operates with $O((\log n)^2)$ depth on its qubit registers plus ancillas, provided the string is already loaded in $|s\rangle$. However, loading the classical input string into the quantum register has a naive cost of $O(n)$, so the end-to-end quantum complexity is $O(n)$. Once the state is prepared, queries that depend on accessing all suffixes in superposition (such as pattern matching using amplitude amplification) can achieve lower time complexity than their classical counterparts. A major limitation remains: direct measurement collapses the superposition, so reading out all $n$ suffixes is not possible faster than $O(n)$; only global properties can be extracted more efficiently via quantum subroutines (Daskin, 2022).
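For reference, the classical baseline in this comparison can be sketched as follows. This is a deliberately naive suffix array (sorting suffixes directly, which is slower than the linear-time constructions mentioned above) with a binary-search pattern lookup. The function names are illustrative and not taken from the cited work.

```python
def suffix_array(s):
    """Naive suffix array: suffix start positions sorted lexicographically."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def find_occurrences(s, sa, pattern):
    """Return sorted start positions of pattern in s via binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

text = "banana$"
sa = suffix_array(text)
hits = find_occurrences(text, sa, "ana")
print(sa, hits)
```

Each lookup performs $O(\log n)$ prefix comparisons after the index is built, which is the kind of classical query cost the quantum primitive is compared against.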
5. Applications, Operational Assumptions, and Requirements
Applications
- Pattern matching and substring search: Quantum parallelism enables simultaneous querying across all possible suffixes.
- Burrows–Wheeler Transform (BWT): Fast quantum generation of all $n$ cyclic rotations, a bottleneck in BWT, in $O((\log n)^2)$ time (with $|s\rangle$ preloaded).
- Genomic or large text database indexing: Superposition allows for rapid membership, common-prefix, or alignment queries.
- Sequence alignment and convolutional detection: The circulant-matrix approach naturally implements circular convolution, relevant for a variety of sequence analysis tasks.
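
The role of cyclic rotations in the BWT application can be seen in the textbook classical construction, sketched below: append an end-marker, sort all rotations, and read off the last column. This is the classical procedure only, included to show where the rotation-generation step sits. The function name `bwt` and the choice of `$` as marker are assumptions for the example.

```python
def bwt(s, end_marker="$"):
    """Classical Burrows-Wheeler Transform via sorted cyclic rotations."""
    if end_marker in s:
        raise ValueError("input must not contain the end marker")
    t = s + end_marker
    # Generate all n cyclic rotations (the step the quantum circuit parallelizes).
    rotations = [t[k:] + t[:k] for k in range(len(t))]
    # Sort them lexicographically and take the last character of each.
    return "".join(r[-1] for r in sorted(rotations))

transformed = bwt("banana")
print(transformed)  # annb$aa
```

Note that the sorting step is not provided by the rotation superposition itself and still has to be handled by separate post-processing.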
Requirements and Assumptions
- The input string must be prepared as a quantum state $|s\rangle$ in $O(n)$ time or via efficient QRAM.
- Circuit operation necessitates coherence and low-error control of all register and ancilla qubits; the high-precision phase rotations used in the QFT and controlled-phase network must be implemented with sufficiently small error.
- Post-processing steps (e.g., extracting lexicographic order, sorting amplitude values) are typically realized by additional quantum-classical hybrid routines, such as amplitude amplification.
The Circulant-STRING construction leverages the algebraic tractability of circulant matrices to supply a universal and quantum-efficient building block for fast string and sequence processing primitives (Daskin, 2022). A plausible implication is that for string-processing tasks with highly nonlocal dependence, Circulant-STRING may become the default quantum primitive, conditional on further advances in quantum state preparation technology.