Grammar-Compressed Indexes with Logarithmic Search Time
Abstract: Let a text $T[1..n]$ be the only string generated by a context-free grammar with $g$ (terminal and nonterminal) symbols, and of size $G$ (measured as the sum of the lengths of the right-hand sides of the rules). Such a grammar, called a grammar-compressed representation of $T$, can be encoded using essentially $G\lg g$ bits. We introduce the first grammar-compressed index that uses $O(G\lg n)$ bits and can find the $occ$ occurrences of patterns $P[1..m]$ in time $O((m2+occ)\lg G)$. We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.