Papers
Topics
Authors
Recent
2000 character limit reached

On Locating Paths in Compressed Tries

Published 2 Apr 2020 in cs.DS | (2004.01120v4)

Abstract: In this paper, we consider the problem of compressing a trie while supporting the powerful \emph{locate} queries: to return the pre-order identifiers of all nodes reached by a path labeled with a given query pattern. Our result builds on top of the XBWT tree transform of Ferragina et al. [FOCS 2005] and generalizes the \emph{r-index} locate machinery of Gagie et al. [SODA 2018, JACM 2020] based on the run-length encoded Burrows-Wheeler transform (BWT). Our first contribution is to propose a suitable generalization of the run-length BWT to tries. We show that this natural generalization enjoys several of the useful properties of its counterpart on strings: in particular, the transform natively supports counting occurrences of a query pattern on the trie's paths and its size $r$ captures the trie's repetitiveness and lower-bounds a natural notion of trie entropy. Our main contribution is a much deeper insight into the combinatorial structure of this object. In detail, we show that a data structure of $O(r\log n) + 2n + o(n)$ bits, where $n$ is the number of nodes, allows locating the $occ$ occurrences of a pattern of length $m$ in nearly-optimal $O(m\log\sigma + occ)$ time, where $\sigma$ is the alphabet's size. Our solution consists in sampling $O(r)$ nodes that can be used as "anchor points" during the locate process. Once obtained the pre-order identifier of the first pattern occurrence (in co-lexicographic order), we show that a constant number of constant-time jumps between those anchor points lead to the identifier of the next pattern occurrence, thus enabling locating in optimal $O(1)$ time per occurrence.

Citations (2)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.