A characterization of the number of subsequences obtained via the deletion channel (1202.1644v1)
Abstract: Motivated by the study of deletion channels, this work presents improved bounds on the number of subsequences obtained from a binary string X of length n under t deletions. It is known that the number of subsequences in this setting strongly depends on the number of runs in the string X, where a run is a maximal substring consisting of a single repeated character. Our improved bounds are obtained by a structural analysis of the family of r-run strings X, in which we identify the extremal strings with respect to the number of subsequences. Specifically, for every r, we present r-run strings with the minimum (respectively, maximum) number of subsequences under any t deletions, and we perform an exact analysis of the number of subsequences of these extremal strings.
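
To make the quantities in the abstract concrete, the following is a minimal brute-force sketch, not the paper's method. It counts the runs of a string and enumerates the distinct subsequences left after exactly t deletions. The function names `count_runs` and `deletion_subsequences` are hypothetical and chosen only for illustration; the enumeration takes exponential time and is meant for short strings.

```python
from itertools import combinations, groupby


def count_runs(x: str) -> int:
    """Number of runs: maximal blocks of one repeated character."""
    # groupby yields one group per maximal block of equal adjacent characters.
    return sum(1 for _ in groupby(x))


def deletion_subsequences(x: str, t: int) -> set:
    """Distinct strings obtainable from x by deleting exactly t characters.

    Brute force (an illustrative assumption, not the paper's analysis):
    choose which n - t positions to keep and collect the results in a set.
    """
    n = len(x)
    if not 0 <= t <= n:
        raise ValueError("t must satisfy 0 <= t <= len(x)")
    return {
        "".join(x[i] for i in kept)
        for kept in combinations(range(n), n - t)
    }


if __name__ == "__main__":
    # Strings of equal length but different run counts.
    for x in ("0000", "0011", "0101"):
        print(x, count_runs(x), len(deletion_subsequences(x, 1)))
```

For a single deletion (t = 1), the number of distinct subsequences equals the number of runs, because deleting any character within one run yields the same string. The loop above illustrates this for three strings of length 4.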