Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
169 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Space-Efficient Re-Pair Compression (1611.01479v1)

Published 4 Nov 2016 in cs.DS

Abstract: Re-Pair is an effective grammar-based compression scheme achieving strong compression rates in practice. Let $n$, $\sigma$, and $d$ be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and $5n + 4\sigma2 + 4d + \sqrt{n}$ words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of $\lceil\log_2 n\rceil$ bits and a re-writable input text composed by $n$ such words. Our first algorithm runs in expected $\mathcal O(n/\epsilon)$ time and uses $(1+\epsilon)n +\sqrt n$ words of space on top of the text for any parameter $0<\epsilon \leq 1$ chosen in advance. Our second algorithm runs in expected $\mathcal O(n\log n)$ time and improves the space to $n +\sqrt n$ words.

Citations (22)

Summary

We haven't generated a summary for this paper yet.