A fast and simple $O (z \log n)$-space index for finding approximately longest common substrings

Published 24 Nov 2022 in cs.DS | (2211.13434v2)

Abstract: Given a text $T [1..n]$ and a positive constant $\epsilon$, we describe how to build a simple $O (z \log n)$-space index, where $z$ is the number of phrases in the LZ77 parse of $T$. Later, given a pattern $P [1..m]$, the index lets us find, in $O (m \log \log z + \mathrm{polylog} (m + z))$ time and with high probability, a substring of $P$ that occurs in $T$ and whose length is at least a $(1 - \epsilon)$-fraction of the length of a longest common substring of $P$ and $T$.
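
To make the guarantee in the abstract concrete, the following is a minimal sketch of the problem statement only, not of the paper's index. It uses a textbook $O(mn)$-time dynamic program as a reference for the exact longest common substring, and a checker for the $(1 - \epsilon)$ condition. The function names `longest_common_substring` and `is_valid_answer` are hypothetical and chosen for illustration.

```python
def longest_common_substring(P: str, T: str) -> str:
    """Return one longest common substring of P and T.

    Naive O(m * n) dynamic programming baseline, used here only as a
    reference; it is NOT the compressed index described in the abstract.
    """
    best_len, best_end = 0, 0          # best match length and its end position in P
    prev = [0] * (len(T) + 1)          # prev[j]: common suffix length of P[:i-1], T[:j]
    for i in range(1, len(P) + 1):
        curr = [0] * (len(T) + 1)
        for j in range(1, len(T) + 1):
            if P[i - 1] == T[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return P[best_end - best_len:best_end]


def is_valid_answer(candidate: str, P: str, T: str, eps: float) -> bool:
    """Check the abstract's guarantee for a candidate answer.

    The candidate must be a substring of P that occurs in T, and its length
    must be at least (1 - eps) times the length of a longest common substring.
    """
    lcs_len = len(longest_common_substring(P, T))
    return (
        candidate in P
        and candidate in T
        and len(candidate) >= (1 - eps) * lcs_len
    )
```

For example, with `P = "xabcdy"` and `T = "zzabcdzz"` a longest common substring has length 4, so for $\epsilon = 0.25$ any common substring of length at least 3 (such as `"abc"`) is an acceptable answer, whereas for $\epsilon = 0.1$ only a length-4 answer qualifies.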