Linear time construction of compressed text indices in compact space (1401.0936v3)
Abstract: We show that the compressed suffix array and the compressed suffix tree for a string of length $n$ over an integer alphabet of size $\sigma\leq n$ can both be built in $O(n)$ (randomized) time using only $O(n\log\sigma)$ bits of working space. The previously fastest construction algorithms that used $O(n\log\sigma)$ bits of space took times $O(n\log\log\sigma)$ and $O(n\log{\epsilon}n)$ respectively (where $\epsilon$ is any positive constant smaller than $1$). In the passing, we show that the Burrows-Wheeler transform of a string of length $n$ over an alphabet of size $\sigma$ can be built in deterministic $O(n)$ time and space $O(n\log\sigma)$. We also show that within the same time and space, we can carry many sequence analysis tasks and construct some variants of the compressed suffix array and compressed suffix tree.