
Simplified Tight Bounds for Monotone Minimal Perfect Hashing (2403.07760v2)

Published 12 Mar 2024 in cs.DS

Abstract: Given an increasing sequence of integers $x_1,\ldots,x_n$ from a universe $\{0,\ldots,u-1\}$, the monotone minimal perfect hash function (MMPHF) for this sequence is a data structure that answers the following rank queries: $\mathrm{rank}(x) = i$ if $x = x_i$, for $i\in \{1,\ldots,n\}$, and $\mathrm{rank}(x)$ is arbitrary otherwise. Assadi, Farach-Colton, and Kuszmaul recently presented at SODA'23 a proof of the lower bound $\Omega(n \min\{\log\log\log u, \log n\})$ for the bits of space required by MMPHF, provided $u \ge n 2^{2^{\sqrt{\log\log n}}}$, which is tight since there is a data structure for MMPHF that attains this space bound (and answers the queries in $O(\log u)$ time). In this paper, we close the remaining gap by proving that, for $u \ge (1+\epsilon)n$, where $\epsilon > 0$ is any constant, the tight lower bound is $\Omega(n \min\{\log\log\log \frac{u}{n}, \log n\})$, which is also attainable; we observe that, for all reasonable cases when $n < u < (1+\epsilon)n$, known facts imply tight bounds, which virtually settles the problem. Along the way we substantially simplify the proof of Assadi et al., replacing a part of their heavy combinatorial machinery by trivial observations. However, an important part of the proof still remains complicated. This part of our paper repeats arguments of Assadi et al. and is not novel. Nevertheless, we include it, for completeness, offering a somewhat different perspective on these arguments.
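To make the query semantics in the abstract concrete, here is a minimal Python sketch. It is not the construction from the paper: `NaiveMMPHF` is a hypothetical name, and it stores every key explicitly, so it uses far more space than the bounds under discussion. The helper `bound_shape` is likewise an illustrative assumption; it evaluates the expression $n \min\{\log\log\log \frac{u}{n}, \log n\}$ with base-2 logarithms and no hidden constants, and it assumes $u > 4n$ so that the triple logarithm is defined and positive.

```python
import math


class NaiveMMPHF:
    """Reference semantics only: keeps all keys in a dict (not space-efficient)."""

    def __init__(self, keys):
        # The definition requires a strictly increasing sequence x_1 < ... < x_n.
        if any(a >= b for a, b in zip(keys, keys[1:])):
            raise ValueError("keys must be strictly increasing")
        self._rank = {x: i for i, x in enumerate(keys, start=1)}

    def rank(self, x):
        # For x = x_i the answer must be i (1-based).
        # For any other x the answer may be arbitrary; this sketch returns 1.
        return self._rank.get(x, 1)


def bound_shape(n, u):
    """Evaluate n * min(log log log (u/n), log n) with base-2 logs.

    Illustrative only: asymptotic constants are ignored and u > 4n is assumed.
    """
    if u <= 4 * n:
        raise ValueError("this sketch assumes u > 4n")
    triple_log = math.log2(math.log2(math.log2(u / n)))
    return n * min(triple_log, math.log2(n))


h = NaiveMMPHF([3, 8, 15, 42])
print(h.rank(15))                 # position of 15 in the sequence
print(bound_shape(1024, 2 ** 26))  # shape of the bound for n = 1024, u = 2^26
```

The point of an MMPHF is that answers for non-members are unconstrained, which is what allows the structure to use far fewer bits than storing the keys themselves.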

References (17)
  1. Tight bounds for monotone minimal perfect hashing. In Proc. Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 456–476. SIAM, 2023. doi:10.1137/1.9781611977554.ch2.
  2. D. Belazzougui. Linear time construction of compressed text indices in compact space. In Proceedings of the forty-sixth Annual ACM Symposium on Theory of Computing, pages 148–193, 2014. doi:10.1145/2591796.2591885.
  3. Monotone minimal perfect hashing: searching a sorted table with O(1) accesses. In Proc. SODA, pages 785–794. SIAM, 2009. doi:10.1137/1.9781611973068.86.
  4. Hash, displace, and compress. In European Symposium on Algorithms, pages 682–693. Springer, 2009. doi:10.1007/978-3-642-04128-0_61.
  5. Linear-time string indexing and analysis in small space. ACM Transactions on Algorithms (TALG), 16(2):1–54, 2020. doi:10.1145/3381417.
  6. D. Belazzougui and G. Navarro. Alphabet-independent compressed text indexing. ACM Transactions on Algorithms (TALG), 10(4):1–19, 2014. doi:10.1145/2635816.
  7. D. Belazzougui and G. Navarro. Optimal lower and upper bounds for representing sequences. ACM Transactions on Algorithms (TALG), 11(4):1–21, 2015. doi:10.1145/2629339.
  8. D. Clark. Compact pat trees. PhD thesis, University of Waterloo, 1997.
  9. Dictionary matching in a stream. In Proc. ESA, volume 9294 of LNCS, pages 361–372. Springer, 2015. doi:10.1007/978-3-662-48350-3_31.
  10. Information theory and statistics. Elements of Information Theory, 1(1):279–335, 1991. doi:10.1002/0471200611.
  11. M. L. Fredman and J. Komlós. On the size of separating systems and families of perfect hash functions. SIAM Journal on Algebraic Discrete Methods, 5(1):61–68, 1984. doi:10.1137/0605009.
  12. Storing a sparse table with O(1) worst case access time. Journal of the ACM, 31(3):538–544, 1984. doi:10.1145/828.1884.
  13. Fully functional suffix trees and optimal text searching in bwt-runs bounded space. Journal of the ACM (JACM), 67(1):1–54, 2020. doi:10.1145/3375890.
  14. Optimal trade-offs for succinct string indexes. In Automata, Languages and Programming: 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings, Part I 37, pages 678–689. Springer, 2010. doi:10.1007/978-3-642-14165-2_57.
  15. G. Jacobson. Space-efficient static trees and graphs. In Proc. 30th Annual Symposium on Foundations of Computer Science (FOCS), pages 549–554. IEEE, 1989. doi:10.1109/SFCS.1989.63533.
  16. K. Mehlhorn. On the program size of perfect and universal hash functions. In 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), pages 170–175. IEEE, 1982. doi:10.1109/SFCS.1982.80.
  17. J. Radhakrishnan. Improved bounds for covering complete uniform hypergraphs. Information Processing Letters, 41(4):203–207, 1992. doi:10.1016/0020-0190(92)90181-T.
