Hierarchical Navigable Small World (HNSW)

  • HNSW is a hierarchical, graph-based index built from layered small-world graphs that enables efficient approximate nearest neighbor search.
  • It employs beam search with greedy descent and RNG-based neighbor selection to maintain robust connectivity and high recall with sub-linear query complexity.
  • Recent optimizations, including LID-based layer assignment and distributed merging, improve update operations and scalability in dynamic, large-scale environments.

Hierarchical Navigable Small World (HNSW) is a graph-based indexing structure designed for efficient, scalable approximate nearest neighbor search (ANNS), widely adopted in both research and industrial retrieval systems. HNSW leverages a hierarchy of sparse small-world graphs and employs greedy, layered search algorithms to achieve sub-linear query complexity and high recall, particularly in high-dimensional vector spaces. Its architecture and algorithms have inspired a substantial lineage of extensions, optimizations, and practical deployments across billions of data points.

1. Layered Small-World Graph Structure and Construction

HNSW maintains a multi-layer hierarchy $G_0, G_1, \dots, G_{L_{max}}$, where $G_\ell$ is a proximity graph over a geometrically sampled subset of the data. For each point $p$, the maximum layer $\ell(p)$ is drawn from a geometric distribution, $\Pr[\ell(p) = l] = c \cdot e^{-\zeta l}$, with normalization constant $c \approx 1/e^{\zeta}$ in practice (Xiao et al., 10 Jul 2024). Layer 0 contains all $n$ data points, and higher layers contain progressively sparser subsets, forming a skip-list-like hierarchy.
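
A maximum layer with this distribution can be drawn by inverse-transform sampling. The sketch below assumes $\zeta = \ln M$ with $M = 16$ (matching the common hnswlib-style default $m_L = 1/\ln M$); the function name is ours, not a library API.

```python
import math
import random

def sample_max_layer(zeta: float = math.log(16)) -> int:
    """Draw a point's maximum layer with Pr[l] proportional to exp(-zeta*l):
    floor(-ln(U)/zeta) for U ~ Uniform(0, 1] is geometrically distributed,
    so layer 0 is most likely and each higher layer is e^zeta times rarer."""
    u = 1.0 - random.random()          # in (0, 1], avoids log(0)
    return int(-math.log(u) / zeta)
```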

Index construction proceeds as follows:

  • A new point $p$ is assigned a random maximum layer $\ell(p)$.
  • Starting from the global entry point at $L_{max}$, greedy routing is performed down to layer $\ell(p)+1$ to locate appropriate insertion points.
  • At each layer $\ell$ from $\ell(p)$ down to 0, a beam search (best-first search with candidate list size $ef_{construction}$, typically 100–200) collects $W$ candidate neighbors. The selection heuristic (approximate relative-neighborhood graph) prunes these to at most $M$ bidirectional connections, maintaining connectivity and diversity (Malkov et al., 2016); a sketch of this pruning rule follows the list.
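
The pruning rule can be expressed compactly. Below is a minimal sketch, assuming `candidates` arrives as (distance-to-inserted-point, id) pairs and `dist` is the index metric; the function name and signature are illustrative, not a library API.

```python
def select_neighbors_heuristic(candidates, dist, M):
    """Prune candidates to at most M diverse neighbors using the
    approximate relative-neighborhood-graph rule: keep a candidate only
    if it is closer to the inserted point than to every neighbor already
    kept."""
    kept = []
    for d_q, c in sorted(candidates):        # nearest candidates first
        if all(d_q < dist(c, r) for _, r in kept):
            kept.append((d_q, c))
        if len(kept) == M:
            break
    return [c for _, c in kept]
```

The diversity test is what prevents all $M$ edges from pointing into one nearby cluster, which plain $k$-NN linking would tend to do.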

This neighbor-selection rule is critical for robust performance on clustered or low-dimensional data (Malkov et al., 2016, Elliott et al., 28 May 2024).

2. Hierarchical Search Algorithm and Computational Properties

A $k$-NN query $q$ initiates a greedy descent from the entry point in the highest layer, locally moving to the closest neighbor at each level:

  • For $\ell$ descending from $L_{max}$ to 1: $ep \leftarrow \operatorname{argmin}_{x \in \text{neighbors}(ep)} \, d(q, x)$.
  • At layer 0, a best-first (beam) search is executed with candidate pool size $ef$; the closest $k$ neighbors are returned.

The typical metric is Euclidean, $d(q,x) = \|q - x\|_2$ (Xiao et al., 10 Jul 2024). The candidate pool size $ef$ (exploration factor) directly governs recall and query cost; larger $ef$ increases recall but incurs higher latency and memory usage. A minimal sketch of the two-phase query appears below.
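
The two phases can be sketched as follows, assuming `neighbors(v, layer)` and `dist(q, v)` are accessors for the graph and metric; all names are illustrative rather than a specific library's API.

```python
import heapq

def hnsw_search(q, entry, L_max, neighbors, dist, ef, k):
    """Two-phase HNSW query: greedy descent through the upper layers,
    then a best-first beam search with pool size ef at layer 0.
    Returns the k closest (distance, id) pairs found."""
    # Phase 1: greedy descent from L_max down to layer 1.
    ep, d_ep = entry, dist(q, entry)
    for layer in range(L_max, 0, -1):
        improved = True
        while improved:
            improved = False
            for v in neighbors(ep, layer):
                d_v = dist(q, v)
                if d_v < d_ep:
                    ep, d_ep, improved = v, d_v, True
    # Phase 2: beam search at layer 0.
    visited = {ep}
    frontier = [(d_ep, ep)]        # min-heap of nodes to expand
    best = [(-d_ep, ep)]           # max-heap (negated) of the ef best results
    while frontier:
        d_c, c = heapq.heappop(frontier)
        if d_c > -best[0][0]:      # frontier is worse than the worst kept
            break
        for v in neighbors(c, 0):
            if v in visited:
                continue
            visited.add(v)
            d_v = dist(q, v)
            if len(best) < ef or d_v < -best[0][0]:
                heapq.heappush(frontier, (d_v, v))
                heapq.heappush(best, (-d_v, v))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, v) for d, v in best)[:k]
```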

Empirically and theoretically, HNSW search runtime is sub-linear: the average-case query cost is $O(\log n)$, often observed closer to $O(n^{1/2})$ in complex or high-dimensional settings (Xiao et al., 10 Jul 2024, Elliott et al., 28 May 2024). Insertion cost is

$O(\ell(p) \cdot (ef_{construction} \log ef_{construction} + M \log M))$

with space complexity $O(n \cdot M)$ edges (Xiao et al., 10 Jul 2024).

3. Impact of Data Properties, Insertion Order, and Intrinsic Dimensionality

Search quality and recall in HNSW are sensitive to the data's intrinsic dimensionality and to insertion sequence. The pointwise Local Intrinsic Dimensionality (LID), $\mathrm{LID}(x) = \left( \frac{1}{k-1} \sum_{i=1}^{k-1} \ln \frac{d_k(x)}{d_i(x)} \right)^{-1}$, where $d_i(x)$ is the distance from $x$ to its $i$-th nearest neighbor, quantifies the local growth rate of the neighborhood volume. High-LID points are harder to connect and traverse, affecting final recall (Elliott et al., 28 May 2024).
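
As a minimal sketch, the estimator can be computed directly from a point's sorted $k$-NN distances (the function name is ours):

```python
import numpy as np

def local_intrinsic_dim(knn_dists) -> float:
    """MLE-style pointwise LID from the distances to a point's k nearest
    neighbors, per the formula above.  Assumes positive distances that
    are not all equal (otherwise the mean log-ratio is zero)."""
    d = np.sort(np.asarray(knn_dists, dtype=float))
    return 1.0 / np.mean(np.log(d[-1] / d[:-1]))
```

Sorting points by this estimate in descending order before insertion yields the LID-ordered construction discussed below.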

Ordered insertion by descending LID systematically improves recall (up to 12 percentage points in benchmarks) compared to ascending or chronological order. Category-aware or buffer-interleaved insertion can further mitigate recall degradation in real-world batch updates. High global intrinsic dimensionality leads to recall collapse, which must be offset by increasing $M$, $ef_{construction}$, and $ef_{search}$ as needed (Elliott et al., 28 May 2024).

4. Update Operations and Unreachable Points Phenomenon

Frequent updates (deletions, insertions, overwrites) trigger the unreachable points phenomenon: points become isolated in all layers ($\forall\,\ell:\ \mathrm{degree}_{in,G_\ell}(v) = 0$), rendering them inaccessible to search.

The standard "replaced_update" operation involves four steps (sketched in code after the list):

  • Marking a node as deleted.
  • Collecting one-hop and two-hop neighbor candidates.
  • Pruning each candidate's neighbor list.
  • Writing new data over the deleted node.
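
A hedged sketch of these steps against a hypothetical index API (none of these method names come from a real library):

```python
def replaced_update(index, old_id, new_vector):
    """Sketch of the replaced-update flow; `index` and its methods are
    hypothetical stand-ins.  Comments track the steps listed above."""
    index.mark_deleted(old_id)                      # 1. tombstone the node
    candidates = set()
    for u in index.in_neighbors(old_id):            # 2. one-hop neighbors...
        candidates.add(u)
        candidates.update(index.out_neighbors(u))   #    ...plus their two-hop fan-out
    for u in candidates:
        index.prune_neighbor_list(u)                # 3. re-prune each list
    index.overwrite(old_id, new_vector)             # 4. write over the slot
```

The danger described next arises in step 3: if pruning removes a node's only incoming edge, that node becomes unreachable.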

This process may inadvertently disconnect nodes whose only incoming edge is lost, resulting in persistent unreachable subgraphs. On SIFT ($n = 1$M), iteratively deleting and reinserting 5% of points for 3,000 rounds grows the unreachable fraction to 3–4%, with recall@1 dropping ≈3% (Xiao et al., 10 Jul 2024). Increasing $ef$ during search does not recover the lost accuracy.

The MN-RU algorithm mitigates the phenomenon by restricting reconnections to only the one-hop neighbors that lose links, using $\alpha$-RNG pruning with complexity $O(M^2)$, markedly faster than previous $O(M^3)$ two-hop repairs (Xiao et al., 10 Jul 2024):

  • Update efficiency is improved by $2$–$4\times$ over HNSW-RU.
  • Unreachable-point growth is suppressed ($<0.2\%$ after 200 updates).
  • Recall stays within $0.5$–$1\%$ of a freshly built index even after hundreds of updates.

A backup index on unreachable points ("dualSearch") can restore overall recall with minimal overhead.
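
A minimal sketch of the idea, assuming both indexes expose a `search(q, k, ef)` method returning (distance, id) pairs (a hypothetical API):

```python
def dual_search(main_index, backup_index, q, k, ef):
    """Query the main HNSW index plus a small backup index holding the
    unreachable points, then merge and return the k closest hits."""
    hits = main_index.search(q, k, ef) + backup_index.search(q, k, ef)
    return sorted(hits)[:k]
```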

5. Algorithmic Extensions and Optimizations

Recent extensions have focused on mitigating local optima, cluster disconnections, and costly construction in high-dimensional or clustered data:

  • Dual-Branch HNSW ("HNSW++"): splits the index into two parallel branches, each with its own entry point, converging at layer 0. Nodes are assigned to branches and layers using normalized LID, with high-LID nodes prioritized for upper layers; this improves cluster connectivity and recall and reduces cluster disconnections (Nguyen et al., 23 Jan 2025). Search is performed in both branches, and the candidate sets are merged for the final results (see the sketch after this list).
  • Skip-Bridges: allow the search to bypass intermediate sparse layers when a node's LID and proximity to the query exceed given thresholds. This further reduces effective search depth and accelerates recall attainment.
  • LID-based Layer Assignment: stratifies insertion order and connectivity by local dimensionality, yielding recall improvements of +18% (NLP) and +30% (CV) over baseline HNSW across benchmark datasets (Nguyen et al., 23 Jan 2025).

Construction time is reduced by up to 20% relative to standard HNSW, and empirical ablation studies indicate that LID-based layer assignment is central to these gains.
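
A minimal sketch of the dual-branch query merge, assuming each branch exposes the standard HNSW `search` (names are illustrative):

```python
def dual_branch_search(branch_a, branch_b, q, k, ef):
    """Run the usual HNSW search in both branches, then merge the two
    candidate sets, deduplicating by id and keeping the closest k."""
    merged, seen = [], set()
    for d, v in sorted(branch_a.search(q, k, ef) + branch_b.search(q, k, ef)):
        if v not in seen:
            seen.add(v)
            merged.append((d, v))
    return merged[:k]
```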

6. Distributed, Merging, and Disaggregated Memory Architectures

Scalable deployments of HNSW across memory boundaries (e.g., disaggregated memory in data centers) require maintaining the global graph structure:

  • SHINE stores the entire HNSW index across memory nodes (MNs), preserving all edges via remote pointers, accessed by compute nodes (CNs) over RDMA. No graph partitioning occurs, so recall matches single-machine HNSW (Widmoser et al., 23 Jul 2025).
  • Efficient caching mechanisms are required; only vectors (not neighbor lists) are cached per CN. Logical index partitioning via $k$-means clustering minimizes cache-segmentation penalties, and adaptive query routing (via an oracle and load balancing) allows up to $2$–$3\times$ higher query throughput than naive cache-only or no-cache strategies.
  • Distributed HNSW merging involves efficiently combining separately built HNSW graphs. Algorithms such as Naive Graph Merge (NGM), Intra Graph Traversal Merge (IGTM), and Cross Graph Traversal Merge (CGTM) operate via iterative vertex selection, candidate neighbor collection, neighborhood construction (via RNG/k-NN filtering), and information propagation. IGTM achieves $\sim 70\%$ fewer distance computations than naive methods at the same recall, supporting incremental, compaction, and distributed indexing scenarios (Ponomarenko, 21 May 2025); a schematic sketch follows the list.
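
As a schematic sketch only (not the papers' exact procedures), the traversal-merge idea is to seed each vertex's neighbor search in the target graph from its already-merged neighbors, which is where the savings in distance computations come from; every method below is a hypothetical stand-in:

```python
def traversal_merge(target, source, ef, M):
    """Re-link every vertex of `source` into `target`, IGTM-style: reuse a
    vertex's already-merged source-graph neighbors as entry points instead
    of descending `target` from scratch for each insertion."""
    merged = set()
    for v in source.vertices():                     # iterative vertex selection
        seeds = [u for u in source.neighbors(v) if u in merged]
        cands = target.beam_search(v, seeds or [target.entry_point()], ef)
        target.link(v, target.prune_rng(cands, M))  # RNG/k-NN neighborhood filter
        merged.add(v)
    return target
```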

7. Practical Tuning, Adaptive Exploration, and Benchmarking

Operational best practices emphasize parameter selection:

  • $M = 16$ is widely used; increasing $M$ (e.g., to 32) marginally increases recall ($<1\%$) but slows construction and increases space.
  • $ef_{construction} = 100$–$200$ balances build time and recall.
  • $ef_{search}$ should be tuned for the latency/recall trade-off; $ef_{search} = 200$ suffices for ≈90% recall, $1000$ for >99% (Lin, 10 Sep 2024). These settings are illustrated in the snippet below.
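
For concreteness, here is how these parameters map onto the widely used hnswlib library (parameter values follow the guidance above; the data is synthetic):

```python
import numpy as np
import hnswlib

dim, n = 128, 100_000
data = np.random.rand(n, dim).astype(np.float32)

index = hnswlib.Index(space="l2", dim=dim)            # Euclidean metric
index.init_index(max_elements=n, M=16, ef_construction=200)
index.add_items(data, np.arange(n))

index.set_ef(200)                                     # ef_search; ~90% recall regime
labels, distances = index.knn_query(data[:10], k=10)  # 10-NN for 10 queries
```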

Adaptive-ef (Ada-ef) introduces a statistically principled, data- and query-driven determination of $ef$: fitting the Full Distance List (FDL) between the query and the database to a normal distribution allows theoretical estimation of the $ef$ required to meet a desired recall $R$ (Zhang et al., 7 Dec 2025). The algorithm dynamically scores queries and assigns a minimal $ef$ to each, producing up to $4\times$ lower latency and robust recall guarantees, even under distribution shift, with drastically reduced offline computation and memory compared to learning-based adaptive strategies.

For collections under 100K vectors, brute-force (flat) search may be preferable for rapid prototyping; above 1M, HNSW becomes the dominant paradigm, yielding $10$–$100\times$ query speedup over flat methods with negligible nDCG loss. Int8 quantization is a low-risk speed optimization; the nDCG@10 loss is $<0.02$ (Lin, 10 Sep 2024).
