ERA Revisited: Theoretical and Experimental Evaluation

Published 30 Sep 2016 in cs.DC and cs.DS | (1609.09654v1)

Abstract: Efficient construction of the suffix tree for a given input text has been an active area of research ever since the data structure was first introduced. Both theoretical computer scientists and engineers have tackled the problem. In this paper we focus on ERA, the fastest practical suffix tree construction algorithm to date. First, we provide a theoretical analysis of the algorithm in the parallel external memory (PEM) model of computation, assuming a uniformly random input text, and relate the result to the lower bounds. Secondly, we empirically confirm the theoretical results in different test scenarios, exposing the critical terms. Thirdly, we discuss the fundamental characteristics of input texts on which the suffix tree construction algorithms that are fastest in practice fail. This paper serves as a foundation for further research in the area of parallel text indexing.