
Two-way Linear Probing Revisited (2309.05308v2)

Published 11 Sep 2023 in cs.DS

Abstract: We introduce linear probing hashing schemes that construct a hash table of size $n$, with constant load factor $\alpha$, on which the worst-case unsuccessful search time is asymptotically almost surely $O(\log \log n)$. The schemes employ two linear probe sequences to find empty cells for the keys. Matching lower bounds on the maximum cluster size produced by any algorithm that uses two linear probe sequences are obtained as well.
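
To make the phrase "two linear probe sequences" concrete, here is a minimal Python sketch of one simple rule: each key gets two start positions, both are probed linearly, and the key goes into the empty cell reached after fewer probes. This is an illustration only and is not one of the paper's schemes. The class name, the SHA-256-based start positions, and the shorter-sequence rule are all assumptions made for the example, and the $O(\log \log n)$ bound from the abstract is not claimed for it.

```python
import hashlib


class TwoWayLinearProbingTable:
    """Illustrative open-addressing table with two linear probe sequences per key.

    Hypothetical sketch: not the schemes analysed in the paper.
    """

    def __init__(self, size):
        self.size = size
        self.cells = [None] * size
        self.count = 0

    def _starts(self, key):
        # Assumption: derive two start positions from one SHA-256 digest.
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big") % self.size
        h2 = int.from_bytes(digest[8:16], "big") % self.size
        return h1, h2

    def _probes_to_empty(self, start):
        # Count occupied cells passed before the first empty cell (with wraparound).
        steps = 0
        while self.cells[(start + steps) % self.size] is not None:
            steps += 1
        return steps

    def contains(self, key):
        # A search follows both probe sequences until an empty cell ends each one.
        for start in self._starts(key):
            steps = 0
            while steps < self.size:
                cell = self.cells[(start + steps) % self.size]
                if cell is None:
                    break
                if cell == key:
                    return True
                steps += 1
        return False

    def insert(self, key):
        if self.contains(key):
            return False
        if self.count >= self.size:
            raise RuntimeError("table is full")
        h1, h2 = self._starts(key)
        d1 = self._probes_to_empty(h1)
        d2 = self._probes_to_empty(h2)
        # Assumed rule: take the sequence that reaches an empty cell sooner.
        start, dist = (h1, d1) if d1 <= d2 else (h2, d2)
        self.cells[(start + dist) % self.size] = key
        self.count += 1
        return True


table = TwoWayLinearProbingTable(101)
for k in range(50):          # load factor of roughly 0.5
    table.insert(k)
print(table.contains(7), table.contains(999))
```

The sketch supports insertions only. Because no cell is ever emptied, every cell between a key's chosen start position and its final cell stays occupied, so a search along that sequence reaches the key before it reaches an empty cell.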
