A Universal Grammar-Based Code For Lossless Compression of Binary Trees (1304.7392v1)
Abstract: We consider the problem of lossless compression of binary trees, with the aim of reducing the number of code bits needed to store or transmit such trees. A lossless grammar-based code is presented which encodes each binary tree into a binary codeword in two steps. In the first step, the tree is transformed into a context-free grammar from which the tree can be reconstructed. In the second step, the context-free grammar is encoded into a binary codeword. The decoder of the grammar-based code decodes the original tree from its codeword by reversing the two encoding steps. It is shown that the resulting grammar-based binary tree compression code is a universal code on a family of probabilistic binary tree source models satisfying certain weak restrictions.