2000 character limit reached
A Normal Sequence Compressed by PPM$^*$ but not by Lempel-Ziv 78 (2009.04827v1)
Published 10 Sep 2020 in cs.DS, cs.IT, and math.IT
Abstract: In this paper we compare the difference in performance of two of the Prediction by Partial Matching (PPM) family of compressors (PPM$*$ and the original Bounded PPM algorithm) and the Lempel-Ziv 78 (LZ) algorithm. We construct an infinite binary sequence whose worst-case compression ratio for PPM$*$ is $0$, while Bounded PPM's and LZ's best-case compression ratios are at least $1/2$ and $1$ respectively. This sequence is an enumeration of all binary strings in order of length, i.e. all strings of length $1$ followed by all strings of length $2$ and so on. It is therefore normal, and is built using repetitions of de Bruijn strings of increasing order