Sets Represented as the Length-n Factors of a Word (1304.3666v1)
Published 12 Apr 2013 in cs.FL, cs.DM, and math.CO
Abstract: In this paper we consider the following problems: how many different subsets of $\Sigma^n$ can occur as the set of all length-$n$ factors of a finite word? If a subset is representable, how long a word do we need to represent it? How many such subsets are represented by words of length $t$? For the first problem, we give upper and lower bounds of the form $\alpha^{2^n}$ in the binary case. For the second problem, we give a weak upper bound and some experimental data. For the third problem, we give a closed-form formula in the case where $n \le t < 2n$. Algorithmic variants of these problems have previously been studied under the name "shortest common superstring".
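To make the definitions concrete, here is a minimal brute-force sketch in Python. It is not taken from the paper: the function names `factors` and `represented_by_length` are hypothetical, the alphabet defaults to binary, and the enumeration is exponential in $t$, so it is only practical for very small parameters.

```python
from itertools import product


def factors(word: str, n: int) -> frozenset:
    """Return the set of all length-n factors (contiguous substrings) of word."""
    return frozenset(word[i:i + n] for i in range(len(word) - n + 1))


def represented_by_length(n: int, t: int, alphabet: str = "01") -> set:
    """Collect every distinct set of length-n factors over all words of length t.

    Brute force over all len(alphabet)**t words (illustrative assumption,
    not the paper's method).
    """
    return {factors("".join(w), n) for w in product(alphabet, repeat=t)}


if __name__ == "__main__":
    # The word 0110 represents the subset {01, 11, 10} of Sigma^2.
    print(sorted(factors("0110", 2)))
    # Number of subsets of Sigma^2 represented by binary words of length 3.
    print(len(represented_by_length(2, 3)))
```

For $n = 2$ and $t = 3$, the eight binary words yield seven distinct factor sets, because 010 and 101 both represent {01, 10}.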