Towards Better Compressed Representations
Abstract: We introduce the problem of computing a parsing in which each phrase has length at most $m$ and which minimizes the zeroth-order entropy of the parsing. Based on recent theoretical results, we devise a heuristic for this problem. The solution has a straightforward application in succinct text representations and gives practical improvements. Moreover, the proposed heuristic yields a structure whose size can be bounded both by $|S|H_{m-1}(S)$ and by $\frac{|S|}{m}(H_0(S) + \cdots + H_{m-1}(S))$, where $H_{k}(S)$ is the $k$-th order empirical entropy of $S$. We also consider a similar problem in which the first-order entropy is minimized.
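To make the objective concrete, the following minimal Python sketch computes the zeroth-order entropy cost of a parsing, i.e. $|P|H_0(P)$ in bits, where the phrases of the parsing $P$ are treated as symbols. It is only an illustration under stated assumptions: the names `parse_fixed`, `parsing_entropy_bits`, and `best_shift` are hypothetical, and the naive strategy of trying every shift of a fixed-length parsing is not the heuristic proposed in the paper.

```python
from collections import Counter
from math import log2


def parse_fixed(s, m, offset=0):
    """Split s into phrases of length at most m.

    Assumes 0 <= offset < m. If offset > 0, the first phrase is the
    short prefix s[:offset]; the rest is cut into blocks of length m
    (the last block may be shorter).
    """
    phrases = []
    if offset:
        phrases.append(s[:offset])
    phrases.extend(s[i:i + m] for i in range(offset, len(s), m))
    return phrases


def parsing_entropy_bits(phrases):
    """Return |P| * H_0(P) in bits, treating each phrase as one symbol."""
    n = len(phrases)
    counts = Counter(phrases)
    # A phrase occurring c times contributes c * log2(n / c) bits.
    return sum(c * log2(n / c) for c in counts.values())


def best_shift(s, m):
    """Illustrative baseline only: try all m shifts, keep the cheapest."""
    return min(
        (parse_fixed(s, m, off) for off in range(m)),
        key=parsing_entropy_bits,
    )
```

For example, `parse_fixed("abcabc", 3)` gives two identical phrases and therefore zero cost, while the same string parsed with `offset=1` gives three distinct phrases and a strictly larger cost, which is why the choice of phrase boundaries matters.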