
Dynamic Approximate Nearest Neighbours

Updated 15 April 2026
  • Dynamic ANN is a class of data structures and algorithms that support efficient approximate nearest neighbor queries under continuous insertions, deletions, and modifications.
  • These methods integrate graph-based, partition-based, and continuous indexing strategies to balance recall, query latency, and update efficiency, with systems like HNSW and IVFPQ as prominent examples.
  • Empirical benchmarks reveal that optimizing ingestion cost and memory reclamation is key to maintaining high recall and responsiveness in streaming, high-dimensional search applications.

Dynamic Approximate Nearest Neighbours (Dynamic ANN) is the class of data structures and algorithms designed to support efficient approximate nearest neighbor (ANN) queries under the full range of dynamic operations: continuous insertions, deletions, and modifications. This keeps large-scale, evolving vector databases and high-dimensional search systems responsive as their contents change. In contrast to traditional static ANN, dynamic ANN explicitly addresses update latency and index freshness under real-time and streaming data ingestion, a necessity in contemporary AI/ML applications such as retrieval-augmented generation and live recommendation systems (Zeng et al., 2024).

1. Fundamental Principles and Problem Formulation

Dynamic ANN seeks a trade-off between query accuracy (usually recall@K), query throughput/latency, and update efficiency. Given a data set $D$ in a metric or vector space and a fast-changing workload of insertions, deletions, and queries, the goal is to support, at each time $t$, queries of the form: "find a vector $x \in D$ (possibly excluding recently deleted points) such that $x$ is among the $K$ approximate nearest neighbors to a query $q$", with guarantees on approximation ratio or empirical recall (Harwood et al., 2024). Dynamicity is operationalized via:

  • Insertion: Online addition of new points, typically at high/variable ingestion rates.
  • Deletion: Removal or logical invalidation of points, with increasing focus on true memory reclamation and maintaining connectivity (Mishra et al., 19 Dec 2025).
  • Modification: Defined as a composite delete + insert.
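
The three operations above can be illustrated with a deliberately naive exact-scan index. This is a minimal sketch, not taken from any of the cited systems: the class name `DynamicScanIndex` and its methods are hypothetical, deletion is logical (a tombstone) until `compact()` reclaims memory, and modification is implemented literally as delete followed by insert.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


class DynamicScanIndex:
    """Hypothetical exact-scan index supporting insert, delete, and modify."""

    def __init__(self) -> None:
        self._vectors: Dict[int, Tuple[float, ...]] = {}
        self._tombstones: Set[int] = set()

    def insert(self, key: int, vector: Sequence[float]) -> None:
        self._vectors[key] = tuple(vector)
        self._tombstones.discard(key)

    def delete(self, key: int) -> None:
        # Logical deletion: the vector stays in memory but is hidden from queries.
        if key in self._vectors:
            self._tombstones.add(key)

    def modify(self, key: int, vector: Sequence[float]) -> None:
        # Modification as a composite delete + insert.
        self.delete(key)
        self.insert(key, vector)

    def compact(self) -> int:
        # Physical deletion: reclaim the memory held by tombstoned entries.
        reclaimed = len(self._tombstones)
        for key in self._tombstones:
            del self._vectors[key]
        self._tombstones.clear()
        return reclaimed

    def query(self, q: Sequence[float], k: int) -> List[int]:
        # Linear scan over live points, sorted by Euclidean distance.
        live = [
            (math.dist(q, v), key)
            for key, v in self._vectors.items()
            if key not in self._tombstones
        ]
        live.sort()
        return [key for _, key in live[:k]]


if __name__ == "__main__":
    index = DynamicScanIndex()
    index.insert(1, [0.0, 0.0])
    index.insert(2, [1.0, 0.0])
    index.insert(3, [5.0, 5.0])
    print(index.query([0.1, 0.0], 2))  # [1, 2]
    index.delete(1)
    print(index.query([0.1, 0.0], 2))  # [2, 3]
```

A scan like this has trivial updates but linear query cost, which is why the structures discussed in the following sections exist.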

Dynamic ANN workload models (e.g., CANDY) benchmark systems under realistic ingestion streams (e.g., fixed or bursty arrival at $\lambda = 4000$ vectors/s), admission control (buffer overflow/data dropping), and micro-batching (the batch size $B$ parameterizes the freshness-latency trade-off) (Zeng et al., 2024).
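
To make the freshness-latency trade-off concrete, the sketch below works through the arithmetic for a fixed arrival rate. It is an idealized model and not CANDY's methodology: arrivals are assumed evenly spaced, a batch is assumed to close the moment its last vector arrives, and all function names are hypothetical.

```python
def batch_fill_time(batch_size: int, arrival_rate: float) -> float:
    """Seconds needed to collect one micro-batch at a fixed arrival rate."""
    return batch_size / arrival_rate


def mean_pending_delay(batch_size: int, arrival_rate: float) -> float:
    """Average wait between a vector's arrival and the close of its batch.

    With evenly spaced arrivals, the i-th vector (0-based) of a batch waits
    (B - 1 - i) / lambda, so the mean over the batch is (B - 1) / (2 * lambda).
    """
    return (batch_size - 1) / (2.0 * arrival_rate)


def steady_state_drop_rate(arrival_rate: float, service_rate: float) -> float:
    """Vectors dropped per second once a bounded buffer is full (assumed model)."""
    return max(0.0, arrival_rate - service_rate)


if __name__ == "__main__":
    rate = 4000.0  # vectors per second, as in the example above
    for b in (1, 100, 4000):
        print(b, batch_fill_time(b, rate), mean_pending_delay(b, rate))
```

Under these assumptions, a batch of 4000 vectors at 4000 vectors/s takes one second to fill and delays the average vector by about half a second before indexing even starts.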

2. Main Algorithmic Frameworks

Dynamic ANN methods decompose primarily into:

  • Ranging/Partition-based: Locality-sensitive hashing (LSH), partition trees, and vector quantization (PQ/IVFPQ/ScaNN). These methods partition the metric space into buckets or clusters for fast candidate pruning. Incremental adaptation is typically possible only for the data-independent (hash-based) case; product quantization and clustering-based schemes often require costly re-training or batch amortization for centroids/codebooks (Aden-Ali et al., 20 Dec 2025, Harwood et al., 2024).
  • Graph-based/Navigation-based: Proximity and small-world graphs (e.g., HNSW, NSW, DEG). These structures incrementally link new data via local search or edge swaps, maintaining graph connectivity and navigability for queries (Hezel et al., 2023, Mishra et al., 19 Dec 2025). Graph-based methods dominate at high recall and enable native $O(\log N)$ incremental insertion, but true deletion is challenging: naive logical deletion ("tombstones") is memory-intensive and degrades search, while structural patching (e.g., SPatch) can reclaim memory and preserve search efficiency (Mishra et al., 19 Dec 2025).
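
The idea of linking new points via local search can be sketched with a single-layer proximity graph. This is a simplified illustration in the spirit of NSW, not HNSW or DEG: there is no hierarchy, no degree pruning on existing nodes, and no deletion, and the class `FlatProximityGraph` and its parameters `max_links` and `ef` are assumptions made for the example.

```python
import heapq
import math
from typing import Dict, List, Optional, Sequence, Set, Tuple


class FlatProximityGraph:
    """Hypothetical single-layer proximity graph with incremental insertion."""

    def __init__(self, max_links: int = 4) -> None:
        self.max_links = max_links
        self.vectors: Dict[int, Tuple[float, ...]] = {}
        self.neighbors: Dict[int, Set[int]] = {}
        self.entry: Optional[int] = None

    def _search(self, q: Sequence[float], ef: int) -> List[Tuple[float, int]]:
        """Best-first search from the entry point; returns up to ef (dist, id) pairs."""
        if self.entry is None:
            return []
        start = (math.dist(q, self.vectors[self.entry]), self.entry)
        visited = {self.entry}
        candidates = [start]            # min-heap of frontier nodes
        best = [(-start[0], start[1])]  # max-heap (negated) of the ef best so far
        while candidates:
            dist, node = heapq.heappop(candidates)
            if len(best) >= ef and dist > -best[0][0]:
                break  # the frontier can no longer improve the result set
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                d = math.dist(q, self.vectors[nb])
                if len(best) < ef or d < -best[0][0]:
                    heapq.heappush(candidates, (d, nb))
                    heapq.heappush(best, (-d, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        return sorted((-neg, node) for neg, node in best)

    def insert(self, key: int, vector: Sequence[float]) -> None:
        # Locate nearby nodes with a search, then link the new node to the closest ones.
        found = self._search(vector, ef=2 * self.max_links)
        self.vectors[key] = tuple(vector)
        self.neighbors[key] = set()
        for _, nb in found[: self.max_links]:
            self.neighbors[key].add(nb)
            self.neighbors[nb].add(key)  # undirected edge; degree is not pruned here
        if self.entry is None:
            self.entry = key

    def query(self, q: Sequence[float], k: int, ef: int = 16) -> List[int]:
        return [node for _, node in self._search(q, max(ef, k))[:k]]


if __name__ == "__main__":
    graph = FlatProximityGraph(max_links=4)
    for i in range(20):
        graph.insert(i, (float(i), 0.0))
    print(graph.query((7.2, 0.0), 3))  # [7, 8, 6]
```

Insertion cost here is one search plus a handful of edge updates, which is the property that makes graph indexes attractive for streaming inserts; production systems add layers and neighbor-selection heuristics on top.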

Additional paradigms include continuous indexes (random projections; DCI) that avoid partitioning and probabilistically guarantee sublinear query/update complexity (Li et al., 2015), geometric data structures (polygonal or hyperbolic domains) (Laan et al., 12 Mar 2026, Kisfaludi-Bak et al., 2023), and dynamic range/retroactive ANN (Goodrich et al., 2011).

3. Metrics and Benchmarking

Formally, the core evaluation criteria for dynamic ANN are:

  • Recall@K:

$$\mathrm{Recall}(K) = \frac{|\mathrm{ApproxNN}(q, K) \cap \mathrm{TrueNN}(q, K)|}{K}$$

where $\mathrm{ApproxNN}(q, K)$ is the returned result and $\mathrm{TrueNN}(q, K)$ denotes ground-truth neighbors for each query $q$.

  • Average Query Latency:

$$\bar{T}_{\mathrm{query}} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} t_i$$

where $t_i$ is wall-clock time for query $q_i$.

  • Average Update Latency:

$$\bar{T}_{\mathrm{update}} = \frac{1}{|U|} \sum_{j=1}^{|U|} c_j$$

with $c_j$ the cost per insert or delete.

  • Combined Metrics: Product or Pareto frontiers over the recall and latency metrics above (Zeng et al., 2024).
  • Speedup Over Baseline:

$$\mathrm{Speedup} = \frac{T_{\mathrm{brute}}}{T_{\mathrm{ANN}}}$$

with $T_{\mathrm{brute}}$ the brute-force baseline (Harwood et al., 2024).
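
The evaluation criteria above reduce to a few lines of arithmetic. The helpers below are a minimal sketch with hypothetical function names; they assume results are given as ranked lists of integer ids and latencies as floats in a common unit.

```python
from typing import Iterable, Sequence


def recall_at_k(approx: Sequence[int], truth: Sequence[int], k: int) -> float:
    """Fraction of the true top-k neighbors present in the returned top-k."""
    return len(set(approx[:k]) & set(truth[:k])) / k


def mean_latency(samples: Iterable[float]) -> float:
    """Average per-operation latency; works for query or update timings."""
    values = list(samples)
    return sum(values) / len(values)


def speedup(baseline_latency: float, method_latency: float) -> float:
    """How many times faster a method is than the brute-force baseline."""
    return baseline_latency / method_latency


if __name__ == "__main__":
    print(recall_at_k([1, 2, 3, 9], [1, 2, 3, 4], 4))  # 0.75
    print(mean_latency([1.0, 2.0, 3.0]))               # 2.0
    print(speedup(10.0, 2.0))                          # 5.0
```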

Dynamic benchmarks (e.g., CANDY) incorporate synthetic drift scenarios, micro-batch variation, and pending-write latency breakdowns to reveal ingestion bottlenecks and the impact of semantic shift on recall (Zeng et al., 2024).

4. State-of-the-Art Algorithmic Techniques

Graph-based Methods

  • HNSW and Extensions: Hierarchical graph built incrementally with online insertion ($O(\log N)$); deletion was originally handled only logically. Recent work introduces deterministic and randomized patching (SPatch) via star-mesh transforms and sparsification, provably preserving random-walk hitting times and supporting efficient physical deletion and memory reclamation (Mishra et al., 19 Dec 2025).
  • Dynamic Exploration Graph (DEG): Even-regular undirected graph structure supporting continuous refinement through both incremental insertion and edge-swap-based optimization. Preserves connectivity and achieves state-of-the-art search throughput, superior to HNSW and others at extreme recall under streaming (Hezel et al., 2023).
  • Dynamic Adaptation to Drift: HNSW with hierarchical clustering adapts more robustly under distribution shift, maintaining recall where other methods collapse (Zeng et al., 2024).
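
The difference between tombstoning and physical deletion with patching can be shown on a plain adjacency map. The sketch below is not SPatch: it applies only a naive, unweighted star-to-mesh replacement (connect all former neighbors of the deleted node pairwise) with no sparsification and no hitting-time guarantees. The function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def delete_with_mesh_patch(neighbors: Dict[int, Set[int]], node: int) -> int:
    """Physically remove `node` and connect its former neighbors pairwise.

    Returns the number of edges added. Any path that used to pass through
    `node` still exists afterwards, so connectivity is preserved.
    """
    former = neighbors.pop(node)
    for u in former:
        neighbors[u].discard(node)
    added = 0
    for u, w in combinations(sorted(former), 2):
        if w not in neighbors[u]:
            neighbors[u].add(w)
            neighbors[w].add(u)
            added += 1
    return added


def is_connected(neighbors: Dict[int, Set[int]]) -> bool:
    """Depth-first check that every node is reachable from an arbitrary start."""
    if not neighbors:
        return True
    start = next(iter(neighbors))
    seen = {start}
    stack = [start]
    while stack:
        for nb in neighbors[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(neighbors)


if __name__ == "__main__":
    # Star graph: node 0 is the hub for nodes 1, 2, and 3.
    graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
    print(delete_with_mesh_patch(graph, 0))  # 3
    print(is_connected(graph))               # True
```

The full mesh grows quadratically in the degree of the deleted node, which is one reason a practical scheme needs a sparsification step after the transform.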

Partition/Quantization-based Methods

  • ScaNN/IVFPQ: Fast under batched updates with full or partial re-clustering; strongest at moderate recall and large batch sizes $B$. Not natively incremental: insertions are amortized over large $B$ or require a full rebuild (Harwood et al., 2024).
  • Dynamic Quantization (CoDEQ): Provably dynamically consistent product quantization under streaming updates with bounded disk I/O, extending PQ methods to dynamic settings while retaining formal $(1+\varepsilon)$-ANN accuracy and achieving near-optimal recall/latency trade-offs (Aden-Ali et al., 20 Dec 2025).
  • ML-Driven Indexing: Replaces bucket assignment with neural (MLP) forwarding, which can boost recall in some settings, at the cost of increased sensitivity to distribution shift and potentially higher per-query expense (Zeng et al., 2024).
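
The batch-amortized update pattern of partition-based indexes can be sketched with a small inverted-file structure. This is an illustrative toy, not ScaNN, IVFPQ, or CoDEQ: vectors are stored uncompressed, a "rebuild" is a single Lloyd step followed by reassignment, keys are assumed unique, and the class `BatchedIVF` with its `rebuild_every` and `nprobe` parameters is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


class BatchedIVF:
    """Hypothetical inverted-file index that re-clusters after every B inserts."""

    def __init__(self, centroids: Sequence[Sequence[float]], rebuild_every: int) -> None:
        self.centroids: List[Vector] = [tuple(c) for c in centroids]
        self.rebuild_every = rebuild_every  # plays the role of batch size B
        self.lists: List[Dict[int, Vector]] = [dict() for _ in self.centroids]
        self.pending = 0
        self.rebuilds = 0

    def _nearest_cells(self, v: Sequence[float], n: int) -> List[int]:
        order = sorted(
            range(len(self.centroids)),
            key=lambda c: math.dist(v, self.centroids[c]),
        )
        return order[:n]

    def insert(self, key: int, vector: Sequence[float]) -> None:
        vec = tuple(vector)
        self.lists[self._nearest_cells(vec, 1)[0]][key] = vec
        self.pending += 1
        if self.pending >= self.rebuild_every:
            self.rebuild()  # cost is amortized over the whole batch

    def delete(self, key: int) -> bool:
        for cell in self.lists:
            if cell.pop(key, None) is not None:
                return True
        return False

    def rebuild(self) -> None:
        # One Lloyd step: move each non-empty centroid to its cell mean, then reassign.
        for c, cell in enumerate(self.lists):
            if cell:
                dim = len(self.centroids[c])
                self.centroids[c] = tuple(
                    sum(v[i] for v in cell.values()) / len(cell) for i in range(dim)
                )
        items = [(k, v) for cell in self.lists for k, v in cell.items()]
        self.lists = [dict() for _ in self.centroids]
        for k, v in items:
            self.lists[self._nearest_cells(v, 1)[0]][k] = v
        self.pending = 0
        self.rebuilds += 1

    def query(self, q: Sequence[float], k: int, nprobe: int = 1) -> List[int]:
        # Scan only the nprobe cells whose centroids are closest to the query.
        cands = [
            (math.dist(q, v), key)
            for c in self._nearest_cells(q, nprobe)
            for key, v in self.lists[c].items()
        ]
        return [key for _, key in sorted(cands)[:k]]
```

A short usage example: with centroids `[(0, 0), (10, 10)]` and `rebuild_every=4`, inserting four points triggers exactly one rebuild, after which `query((9, 9), 2)` scans only the cell near `(10, 10)`. Raising `rebuild_every` lowers the amortized update cost but lets centroids go stale for longer.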

Continuous/Projection-based Methods

  • Dynamic Continuous Indexing: Indexes based on random 1D projections (no partitioning); supports insertion and deletion with deterministic cost, with sublinear query time w.r.t. intrinsic dimension; empirically outperforms LSH in high dimensionality (Li et al., 2015).
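
The projection idea can be sketched as a set of sorted lists, one per random direction. This is a loose illustration and not the DCI algorithm: candidate retrieval uses a fixed window around the query's projection instead of DCI's prioritized visiting, no guarantees carry over, and Python's `bisect.insort` shifts list elements in linear time, so a balanced tree would be needed for logarithmic updates. The class `ProjectionIndex` and its parameters are hypothetical.

```python
import bisect
import math
import random
from typing import Dict, List, Sequence, Tuple


class ProjectionIndex:
    """Hypothetical index over random 1D projections, with no space partitioning."""

    def __init__(self, dim: int, num_projections: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.directions = [
            self._unit([rng.gauss(0.0, 1.0) for _ in range(dim)])
            for _ in range(num_projections)
        ]
        # One sorted list of (projection value, key) per direction.
        self.sorted_lists: List[List[Tuple[float, int]]] = [[] for _ in self.directions]
        self.vectors: Dict[int, Tuple[float, ...]] = {}

    @staticmethod
    def _unit(v: List[float]) -> List[float]:
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    @staticmethod
    def _dot(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(x * y for x, y in zip(a, b))

    def insert(self, key: int, vector: Sequence[float]) -> None:
        vec = tuple(vector)
        self.vectors[key] = vec
        for direction, lst in zip(self.directions, self.sorted_lists):
            bisect.insort(lst, (self._dot(direction, vec), key))

    def delete(self, key: int) -> None:
        vec = self.vectors.pop(key)
        for direction, lst in zip(self.directions, self.sorted_lists):
            # Recomputing the projection reproduces the stored entry exactly.
            pos = bisect.bisect_left(lst, (self._dot(direction, vec), key))
            del lst[pos]

    def query(self, q: Sequence[float], k: int, per_projection: int = 8) -> List[int]:
        candidates = set()
        for direction, lst in zip(self.directions, self.sorted_lists):
            # A 1-tuple sorts before every (value, key) pair with the same value.
            pos = bisect.bisect_left(lst, (self._dot(direction, q),))
            lo = max(0, pos - per_projection)
            hi = min(len(lst), pos + per_projection)
            candidates.update(key for _, key in lst[lo:hi])
        # Re-rank the candidate union by true Euclidean distance.
        ranked = sorted((math.dist(q, self.vectors[key]), key) for key in candidates)
        return [key for _, key in ranked[:k]]
```

Points that are close in the original space are close in every projection, so the windows tend to contain the true neighbors; widening `per_projection` trades query time for recall.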

Specialized Geometric and Retroactive Structures

  • Polygonal/Hyperbolic ANNs: Recent results provide logarithmic-time dynamic ANN structures for non-Euclidean domains (e.g., polygonal with obstacles, hyperbolic), via separator trees and locally sensitive hierarchical quadtrees (Laan et al., 12 Mar 2026, Kisfaludi-Bak et al., 2023).
  • Fully Retroactive ANN: Segmented time trees plus colored dynamic ANN, supporting retrospective queries and updates with the overhead bound stated in Goodrich et al. (2011).
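
The semantics of a retrospective query (as opposed to the data structure that makes it efficient) can be shown with a linear scan over time-stamped records. This sketch is not the segmented-time-tree construction of Goodrich et al.; it only illustrates what "nearest neighbors as of time t" means, and the class `VersionedScanIndex` is hypothetical.

```python
import math
from typing import List, Sequence


class VersionedScanIndex:
    """Hypothetical scan index in which every point has a lifetime [t_in, t_out)."""

    def __init__(self) -> None:
        # Each record is [key, vector, inserted_at, deleted_at].
        self.records: List[list] = []

    def insert(self, key: int, vector: Sequence[float], t: float) -> None:
        # The timestamp may lie in the past: this is a retroactive insertion.
        self.records.append([key, tuple(vector), t, math.inf])

    def delete(self, key: int, t: float) -> None:
        # Close the lifetime of the currently open record for this key.
        for record in self.records:
            if record[0] == key and record[3] == math.inf:
                record[3] = t

    def query_as_of(self, q: Sequence[float], k: int, t: float) -> List[int]:
        live = [
            (math.dist(q, vec), key)
            for key, vec, t_in, t_out in self.records
            if t_in <= t < t_out
        ]
        return [key for _, key in sorted(live)[:k]]


if __name__ == "__main__":
    index = VersionedScanIndex()
    index.insert(1, (0.0, 0.0), t=1)
    index.insert(2, (1.0, 1.0), t=5)
    index.delete(1, t=8)
    print(index.query_as_of((0.0, 0.0), 1, t=3))  # [1]
    print(index.query_as_of((0.0, 0.0), 1, t=9))  # [2]
```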

5. Comparative Empirical Findings

Comprehensive empirical benchmarking reveals key trade-offs:

| Method (Dynamic) | Query Latency (ms) | Recall@10 | Update Cost | Strengths | Weaknesses |
|---|---|---|---|---|---|
| Brute-Force Scan | 0.43 | 1.00 | O(1) | Always optimal recall; update-free | Not scalable |
| HNSW (graph-based) | 13,466 | 0.61 | O(log N) | Stable under drift, scales to high recall, fast incremental insert, robust to streaming | Deletion non-trivial, ingestion bottlenecks |
| IVF-PQ (static) | 414.77 | 0.50 | O(N/b) | Efficient batched update, tunable batch size | Requires full rebuild, not robust to drift |
| LSH | 11.96 | 0.00 | O(1) | Simple, very fast updates | Poor recall on real tasks |
| ML-LSH (ML-optimized) | 5.82 | 0.374 | O(1) | ML can boost recall in low-accuracy/low-load | Fragile to drift, higher query time |
| DEG | n/a | ≥0.95 | O(dh) | Connectivity, robust high-recall, steady optimization | Param tuning needed, more complex |

— Table adapted from multiple experimental sections (Zeng et al., 2024, Hezel et al., 2023).
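
One way to read the table is to compute the Pareto frontier over query latency and recall. The sketch below uses only the latency and Recall@10 values listed in the table; DEG is left out because the table gives no latency for it. The function name `pareto_frontier` is hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_frontier(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return methods not dominated on (latency in ms, recall).

    A method is dominated if another one is at least as good on both axes
    (lower or equal latency, higher or equal recall) and strictly better on one.
    """
    frontier = []
    for name, (latency, recall) in points.items():
        dominated = any(
            (l2 <= latency and r2 >= recall) and (l2 < latency or r2 > recall)
            for other, (l2, r2) in points.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


if __name__ == "__main__":
    table = {
        "Brute-Force Scan": (0.43, 1.00),
        "HNSW": (13466.0, 0.61),
        "IVF-PQ": (414.77, 0.50),
        "LSH": (11.96, 0.00),
        "ML-LSH": (5.82, 0.374),
    }
    print(pareto_frontier(table))  # ['Brute-Force Scan']
```

On these figures the scan dominates every other listed method on both axes. The frontier says nothing about update cost or scalability, which the table reports in separate columns.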

Key findings:

  • Ingestion cost dominates: For complex ANN structures, the bottleneck in dynamic ingestion—especially under high event rates—is almost always the time spent processing new points, not the raw search cost.
  • Simplicity wins at high rate: Lightweight methods (e.g., scan, LSH) outperform under extreme arrival rates, contrary to static-benchmark intuition.
  • Robustness to drift: Only hierarchical graph-based methods (HNSW) maintain recall under intense distribution shifts.
  • No universal batch size: Micro-batch tuning is algorithm-specific.
  • DL/Quantization optimizations offer gains, but only if coupled with ingestion-aware design.

6. Practical Guidelines and System Design Recommendations

  • For high-frequency, low-latency ingestion ($B$ small, $\lambda$ large): Use incremental graph-based methods (HNSW, DEG), with periodic patching or logical deletion reclamation (Mishra et al., 19 Dec 2025, Hezel et al., 2023).
  • For moderate/low-frequency, high-throughput ingestion ($B$ large): Prefer quantization-based approaches (ScaNN, IVFPQ, CoDEQ) with amortized batch rebuilds (Aden-Ali et al., 20 Dec 2025, Harwood et al., 2024).
  • When high absolute recall is required: HNSW and its extensions dominate, with reported speedups over brute-force scan (Harwood et al., 2024).
  • Efficient deletion: In pure read-heavy workloads, tombstoning is viable; frequent deletions demand graph patching schemes (SPatch) (Mishra et al., 19 Dec 2025).
  • Distributional monitoring: Monitor for concept drift and selectively retrain or adapt partitions to maintain query fidelity (Zeng et al., 2024).
  • Expose batch/ingest parameters and automate their tuning via lightweight online experiments.
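
The last recommendation, tuning batch parameters through lightweight online experiments, can be sketched as a small search over candidate batch sizes. Everything here is an assumption made for illustration: the function `tune_batch_size`, the `measure_cost` callback, and the synthetic cost model (a fixed per-batch overhead amortized over $B$ plus a freshness penalty that grows with $B$) are hypothetical and would be replaced by real measurements.

```python
from typing import Callable, Iterable, Optional, Tuple


def tune_batch_size(
    candidates: Iterable[int],
    measure_cost: Callable[[int], float],
    trials: int = 3,
) -> Tuple[Optional[int], float]:
    """Try each candidate batch size and keep the one with the lowest mean cost."""
    best: Optional[int] = None
    best_cost = float("inf")
    for batch_size in candidates:
        cost = sum(measure_cost(batch_size) for _ in range(trials)) / trials
        if cost < best_cost:
            best, best_cost = batch_size, cost
    return best, best_cost


def synthetic_cost(batch_size: int) -> float:
    """Assumed cost model: amortized overhead 100/B plus freshness penalty 0.01*B."""
    return 100.0 / batch_size + 0.01 * batch_size


if __name__ == "__main__":
    print(tune_batch_size([10, 100, 1000], synthetic_cost))
```

In a live system `measure_cost` would run a short ingestion window at the given batch size and return a weighted combination of update latency and recall loss; since no batch size is universal, the experiment has to be repeated per algorithm and per workload.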

7. Open Problems and Future Directions

Dynamic ANN remains an active area of research:

  • End-to-end dynamic quantization: Extending provable dynamic consistency and streaming update to more general quantizer classes (beyond median-split), tighter guarantees under adversarial inserts/deletes (Aden-Ali et al., 20 Dec 2025).
  • Joint optimization of ingestion and query layers: Achieving ML-optimized, highly adaptive but robust indexing that balances per-query and per-update latency, potentially with online learning for drift detection/adaption (Zeng et al., 2024).
  • Dynamic ANN in non-Euclidean and constrained spaces: Further development of efficient dynamic structures for polygonal or hyperbolic geometries, spanners, and generalized metric spaces (Kisfaludi-Bak et al., 2023, Laan et al., 12 Mar 2026).
  • Retrospective and temporal ANN: Full support for arbitrary time-range queries, supporting ‘back-in-time’ ANN, is enabled at scale only by fully retroactive data structures (Goodrich et al., 2011).
  • Holistic dynamic evaluation standards: Adoption of continuous-ingestion benchmarking (as in CANDY) that foregrounds ingestion, drift, and joint recall/latency trade-offs as first-class metrics (Zeng et al., 2024).

Dynamic ANN research highlights the distinctive computational demands of real-world, real-time vector search. Effective system design requires integrating dynamic update protocols, robust search, and mechanisms for distributional adaptation within a unified, ingestion-sensitive architecture.
