Quantum Speed-ups for String Synchronizing Sets, Longest Common Substring, and k-mismatch Matching (2211.15945v1)
Abstract: Longest Common Substring (LCS) is an important text processing problem, which has recently been investigated in the quantum query model. The decisional version of this problem, LCS with threshold $d$, asks whether two length-$n$ input strings have a common substring of length $d$. The two extreme cases, $d=1$ and $d=n$, correspond respectively to Element Distinctness and Unstructured Search, two fundamental problems in quantum query complexity. However, the intermediate case $1\ll d\ll n$ was not fully understood. We show that the complexity of LCS with threshold $d$ smoothly interpolates between the two extreme cases up to $n^{o(1)}$ factors: LCS with threshold $d$ has a quantum algorithm in $n^{2/3+o(1)}/d^{1/6}$ query complexity and time complexity, and requires at least $\Omega(n^{2/3}/d^{1/6})$ quantum query complexity. Our result improves upon previous upper bounds $\tilde O(\min\{n/d^{1/2}, n^{2/3}\})$ (Le Gall and Seddighin ITCS 2022, Akmal and Jin SODA 2022), and answers an open question of Akmal and Jin. Our main technical contribution is a quantum speed-up of the powerful String Synchronizing Set technique introduced by Kempa and Kociumaka (STOC 2019). It consistently samples $n/\tau^{1-o(1)}$ synchronizing positions in the string depending on their length-$\Theta(\tau)$ contexts, and each synchronizing position can be reported by a quantum algorithm in $\tilde O(\tau^{1/2+o(1)})$ time. As another application of our quantum string synchronizing set, we study the $k$-mismatch Matching problem, which asks if the pattern has an occurrence in the text with at most $k$ Hamming mismatches. Using a structural result of Charalampopoulos, Kociumaka, and Wellnitz (FOCS 2020), we obtain a quantum algorithm for $k$-mismatch matching with $k^{3/4} n^{1/2+o(1)}$ query complexity and $\tilde O(kn^{1/2})$ time complexity.
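
To make the two decision problems concrete, the following is a minimal classical brute-force sketch of their definitions as stated in the abstract. It is not the quantum algorithm of the paper and makes no attempt at efficiency. The function names, the convention for $d \le 0$, and the helper `lcs_query_exponent` (which only evaluates the exponent of $n$ in the stated bound $n^{2/3}/d^{1/6}$ for $d = n^{\alpha}$, ignoring $n^{o(1)}$ factors) are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def has_common_substring(s: str, t: str, d: int) -> bool:
    """Classical reference for LCS with threshold d:
    do s and t share a common substring of length d?"""
    if d <= 0:
        return True  # assumed convention: the empty string is always common
    if d > min(len(s), len(t)):
        return False
    # Collect every length-d window of s, then probe with the windows of t.
    windows = {s[i:i + d] for i in range(len(s) - d + 1)}
    return any(t[j:j + d] in windows for j in range(len(t) - d + 1))


def has_k_mismatch_occurrence(pattern: str, text: str, k: int) -> bool:
    """Classical reference for k-mismatch matching:
    does pattern align somewhere in text with at most k Hamming mismatches?"""
    m = len(pattern)
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        mismatches = sum(1 for a, b in zip(pattern, window) if a != b)
        if mismatches <= k:
            return True
    return False


def lcs_query_exponent(alpha: Fraction) -> Fraction:
    """Exponent of n in n^{2/3} / d^{1/6} when d = n^alpha
    (hypothetical helper; n^{o(1)} factors are ignored)."""
    return Fraction(2, 3) - Fraction(alpha) / 6
```

For example, `has_common_substring("abcde", "xxcdexx", 3)` holds because both strings contain `cde`. Evaluating `lcs_query_exponent` at $\alpha = 0$ and $\alpha = 1$ gives $2/3$ and $1/2$, the exponents at the two extreme cases $d=1$ (Element Distinctness) and $d=n$ (Unstructured Search) named above.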