
Optimal Bounds for Open Addressing Without Reordering (2501.02305v2)

Published 4 Jan 2025 in cs.DS and math.CO

Abstract: In this paper, we revisit one of the simplest problems in data structures: the task of inserting elements into an open-addressed hash table so that elements can later be retrieved with as few probes as possible. We show that, even without reordering elements over time, it is possible to construct a hash table that achieves far better expected search complexities (both amortized and worst-case) than were previously thought possible. Along the way, we disprove the central conjecture left by Yao in his seminal paper "Uniform Hashing is Optimal". All of our results come with matching lower bounds.

Summary

  • The paper presents new algorithms achieving optimal probe complexities for hash tables without reordering, including elastic and funnel hashing.
  • Elastic hashing achieves O(1) amortized and O(log δ⁻¹) worst-case probe complexities using geometrically structured sub-arrays and batch processing.
  • Funnel hashing refutes previous conjectures by ensuring O(log² δ⁻¹) expected worst-case probe performance through adaptive, multi-level organization.

Optimal Bounds for Open Addressing Without Reordering

This paper presents significant advancements in the domain of open-addressed hash tables by exploring insertion strategies that do not rely on reordering elements. It introduces novel algorithms that achieve optimal bounds on search complexities in such hash tables, disproving longstanding conjectures in the field. The paper highlights two innovative strategies: elastic hashing and funnel hashing, each providing unique insights into achieving efficient probe complexities.

Elastic Hashing

Elastic hashing is introduced as an algorithm that achieves O(1) amortized expected probe complexity and O(log δ⁻¹) worst-case expected probe complexity without reordering. The algorithm structures the hash table into sub-arrays of geometrically decreasing sizes, allowing keys to be inserted with significantly fewer probes than traditional methods. The key innovation lies in breaking the conventional coupon-collector bottleneck by decoupling insertion probe complexity from search probe complexity.

Implementation Details

Elastic hashing employs a probe sequence indexed in two dimensions and mapped onto a one-dimensional array. This sequence lets an insertion examine more slots before settling on a position, reducing the search complexity for subsequent queries. An injective mapping ensures the efficient allocation of slots across the sub-arrays, while batch processing guarantees that the table remains only partially full, optimizing space and probe efficiency.
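The geometric sub-array layout and the linearization of the two-dimensional probe sequence can be sketched as follows. This is an illustrative stand-in for the paper's exact construction: the halving sizes and the bit-interleaving injection are assumptions made for the sketch, not the paper's precise parameters.

```python
def subarray_sizes(n, min_size=8):
    """Split a table of ~n slots into sub-arrays of geometrically
    decreasing size (halving here, as an illustrative choice)."""
    sizes = []
    size = n // 2
    while size >= min_size:
        sizes.append(size)
        size //= 2
    return sizes

def phi(i, j):
    """Injective map from a 2D probe coordinate (sub-array i, probe j)
    to a single 1D position, via bit interleaving.  This plays the role
    of the paper's injection that linearizes the probe sequence."""
    out, bit = 0, 0
    while i or j:
        out |= (i & 1) << (2 * bit)
        out |= (j & 1) << (2 * bit + 1)
        i >>= 1
        j >>= 1
        bit += 1
    return out
```

Because `phi` is injective, every (sub-array, probe) pair gets a distinct position in the combined sequence, so an insertion can interleave probes across several sub-arrays in a single well-defined order.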

Performance Analysis

The algorithm is shown to be optimal in amortized expected complexity through detailed bounds summed over multiple batches. Each batch fills its slots efficiently without reordering, leveraging a non-greedy probing strategy that is essential to bypass the lower bounds known for greedy insertion. The mathematical proofs underpinning this method demonstrate that the hash table remains efficient in the worst case even at high load factors.

Funnel Hashing

Funnel hashing is a greedy open-addressing strategy that achieves an optimal worst-case expected probe complexity of O(log² δ⁻¹). This refutes Yao's conjecture that uniform probing is optimal, demonstrating that more efficient probing can be achieved even within the class of greedy algorithms.
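One informal way to read the O(log² δ⁻¹) bound (a decomposition for intuition, not the paper's proof): the table is organized into roughly log δ⁻¹ levels, and a search spends O(log δ⁻¹) probes per level in the worst case:

```latex
\underbrace{O(\log \delta^{-1})}_{\text{number of levels}}
\;\times\;
\underbrace{O(\log \delta^{-1})}_{\text{probes per level}}
\;=\;
O\!\left(\log^{2} \delta^{-1}\right)
```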

Implementation Details

Funnel hashing divides the hash table into multiple levels of geometrically decreasing size, maintaining overall efficiency across queries as the levels fill. The approach balances amortized and worst-case measures, providing guarantees that vary gracefully with the load factor. The hash table adheres strictly to the no-reordering constraint while maintaining optimal performance through its level-by-level probing strategy.
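The multi-level organization can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact scheme: level sizes halve, each level allows roughly log δ⁻¹ probe attempts, and a plain dictionary stands in for the paper's specially structured final level.

```python
import hashlib
from math import ceil, log2

def _probe(key, level, attempt, size):
    """Deterministic pseudo-random slot for (key, level, attempt)."""
    h = hashlib.blake2b(f"{key}:{level}:{attempt}".encode(), digest_size=8)
    return int.from_bytes(h.digest(), "big") % size

class FunnelHash:
    """Simplified funnel-hashing sketch: levels of geometrically
    decreasing size; each insertion tries a few slots per level and
    "funnels" down on failure.  The overflow dict is a stand-in for
    the paper's specially handled last level."""

    def __init__(self, n, delta=0.1):
        # ~log(1/delta) probe attempts per level, mirroring the analysis
        self.attempts = max(1, ceil(log2(1 / delta)))
        self.levels = []
        size = n
        while size >= 8:          # geometrically shrinking levels
            self.levels.append([None] * size)
            size //= 2
        self.overflow = {}

    def insert(self, key, value):
        for lvl, arr in enumerate(self.levels):
            for a in range(self.attempts):
                j = _probe(key, lvl, a, len(arr))
                if arr[j] is None or arr[j][0] == key:
                    arr[j] = (key, value)
                    return
        self.overflow[key] = value

    def get(self, key):
        for lvl, arr in enumerate(self.levels):
            for a in range(self.attempts):
                j = _probe(key, lvl, a, len(arr))
                if arr[j] is None:
                    return None   # greedy, no deletions: key cannot be deeper
                if arr[j][0] == key:
                    return arr[j][1]
        return self.overflow.get(key)
```

Because the strategy is greedy and this sketch has no deletions, a search can stop at the first empty probed slot: the insertion would have used it.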

Performance Analysis

Funnel hashing's efficiency comes from splitting the entries across smaller levels, so that each probe sequence spends only a bounded number of probes per level before moving on. The level-by-level analysis confirms that the historical conjecture about the optimality of uniform probing is incorrect for greedy strategies. Theorems presented in the paper give explicit bounds and distributions for assessing performance in both practical and theoretical settings.

Lower Bounds and Implications

The paper complements its algorithms with matching lower bounds for hash tables that do not reorder entries, establishing that no open-addressing method in this setting can asymptotically improve on the proposed algorithms' probe complexities. The high-probability worst-case probe bounds further illuminate how key placement constrains probing strategies in hash tables without reordering.

Conclusion

The research presents substantial advancements in hash table performance, disproving earlier conjectures and establishing new bounds for insertion and query complexities. By examining non-reordering open-addressed strategies through elastic and funnel hashing, the paper provides a reference point for future developments in data structures demanding optimal performance. The breakthrough of achieving O(1) amortized complexity without reordering opens up new classes of probe-efficient algorithms that challenge existing paradigms in hash table design and application.
