Dynamic Approximate Nearest Neighbours

Updated 15 April 2026

Dynamic ANN is a class of data structures and algorithms that support efficient approximate nearest neighbor queries under continuous insertions, deletions, and modifications.
These methods integrate graph-based, partition-based, and continuous indexing strategies to balance recall, query latency, and update efficiency, with systems like HNSW and IVFPQ as prominent examples.
Empirical benchmarks reveal that optimizing ingestion cost and memory reclamation is key to maintaining high recall and responsiveness in streaming, high-dimensional search applications.

Dynamic Approximate Nearest Neighbours (Dynamic ANN) is the class of data structures and algorithms designed to support efficient approximate nearest neighbor (ANN) queries under the full spectrum of dynamic operations: continuous insertions, deletions, and modifications. This keeps large-scale, evolving vector databases and high-dimensional search systems responsive and adaptable. In contrast to traditional static ANN, dynamic ANN explicitly addresses update latency and index freshness under real-time and streaming data ingestion, a necessity in contemporary AI/ML applications such as retrieval-augmented generation and live recommendation systems (Zeng et al., 2024).

1. Fundamental Principles and Problem Formulation

Dynamic ANN seeks a trade-off between query accuracy (usually recall@K), query throughput/latency, and update efficiency. Given a data set $D$ in a metric or vector space and a fast-changing workload of insertions, deletions, and queries, the goal is to support, at each time $t$, queries of the form: "find a vector $x\in D$ (possibly excluding recently deleted points) such that $x$ is among the $K$ approximate nearest neighbors of a query $q$", with guarantees on approximation ratio or empirical recall (Harwood et al., 2024). Dynamicity is operationalized via:

Insertion: Online addition of new points, typically at high/variable ingestion rates.
Deletion: Removal or logical invalidation of points, with increasing focus on true memory reclamation and maintaining connectivity (Mishra et al., 19 Dec 2025).
Modification: Defined as a composite delete + insert.
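
The three operations above can be sketched on a brute-force index, where their semantics are easiest to see. This is a minimal illustration, not a method from the cited works: the class name `FlatDynamicIndex` and its methods are hypothetical, and the exact scan stands in for a real ANN search.

```python
import math


class FlatDynamicIndex:
    """Exact-scan index with insert, delete, and modify (hypothetical helper)."""

    def __init__(self):
        self.vectors = {}  # id -> vector (list of floats)

    def insert(self, vec_id, vec):
        self.vectors[vec_id] = list(vec)

    def delete(self, vec_id):
        # Physical removal; returns False if the id was not present.
        return self.vectors.pop(vec_id, None) is not None

    def modify(self, vec_id, new_vec):
        # Modification is modelled as a delete followed by an insert.
        self.delete(vec_id)
        self.insert(vec_id, new_vec)

    def query(self, q, k):
        # Brute-force scan over the current contents of the index.
        ranked = sorted(self.vectors, key=lambda i: math.dist(self.vectors[i], q))
        return ranked[:k]


index = FlatDynamicIndex()
index.insert("a", [0.0, 0.0])
index.insert("b", [1.0, 0.0])
index.insert("c", [5.0, 5.0])
index.modify("c", [0.1, 0.0])  # "c" moves next to the origin
index.delete("a")
result = index.query([0.0, 0.0], k=2)  # only "b" and "c" remain
```

Every update is visible to the next query, which is the freshness property that more elaborate indexes have to work to preserve.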

Dynamic ANN workload models (e.g., CANDY) benchmark systems under realistic ingestion streams (e.g., fixed or bursty arrival at $\lambda=4000$ vectors/s), admission control (buffer overflow/data dropping), and micro-batching (batch size $B$ parameterizes freshness-latency trade-off) (Zeng et al., 2024).
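
The interaction between admission control and micro-batching can be sketched as a bounded buffer that is drained in batches of size $B$. This is an illustrative model only, not the CANDY implementation: `MicroBatchIngestor`, `capacity`, and the list used as a stand-in index are assumptions made for the example.

```python
from collections import deque


class MicroBatchIngestor:
    """Bounded ingest buffer drained in micro-batches (hypothetical sketch)."""

    def __init__(self, batch_size, capacity):
        self.batch_size = batch_size  # B: vectors applied to the index per flush
        self.capacity = capacity      # maximum number of pending vectors
        self.buffer = deque()
        self.dropped = 0
        self.index = []               # stand-in for a real ANN index

    def offer(self, vec):
        # Admission control: drop the arrival when the buffer is full.
        if len(self.buffer) >= self.capacity:
            self.dropped += 1
            return False
        self.buffer.append(vec)
        return True

    def flush(self):
        # Apply one micro-batch of at most `batch_size` pending vectors.
        n = min(self.batch_size, len(self.buffer))
        batch = [self.buffer.popleft() for _ in range(n)]
        self.index.extend(batch)
        return len(batch)

    def pending(self):
        # Vectors accepted but not yet searchable (index staleness).
        return len(self.buffer)


ing = MicroBatchIngestor(batch_size=4, capacity=6)
accepted = [ing.offer([float(i)]) for i in range(8)]  # burst of 8 arrivals
applied = ing.flush()
```

A larger `batch_size` amortizes index maintenance over more vectors but leaves accepted vectors unsearchable for longer, which is the freshness-latency trade-off described above.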

2. Main Algorithmic Frameworks

Dynamic ANN methods decompose primarily into:

Ranging/Partition-based: Locality-sensitive hashing (LSH), partition trees, and vector quantization (PQ/IVFPQ/ScaNN). These methods partition the metric space into buckets or clusters for fast candidate pruning. Incremental adaptation is typically possible only for the data-independent (hash-based) case; product quantization and clustering-based schemes often require costly re-training or batch amortization for centroids/codebooks (Aden-Ali et al., 20 Dec 2025, Harwood et al., 2024).
Graph-based/Navigation-based: Proximity and small-world graphs (e.g., HNSW, NSW, DEG). These structures incrementally link new data via local search or edge swaps, maintaining graph connectivity and navigability for queries (Hezel et al., 2023, Mishra et al., 19 Dec 2025). Graph-based methods dominate at high recall and enable native $O(\log N)$ incremental insertion, but true deletion is challenging: naive logical deletion ("tombstones") is memory-intensive and degrades search, while structural patching (e.g., SPatch) can reclaim memory and preserve search efficiency (Mishra et al., 19 Dec 2025).
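
To illustrate why data-independent hashing adapts incrementally, the sketch below uses random-hyperplane LSH: the hyperplanes are drawn once and never retrained, so an insert or delete touches only one bucket. The class `HyperplaneLSH` and its parameters are hypothetical and chosen for brevity. A production LSH index would use several hash tables and rerank the candidates by true distance.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Single-table random-hyperplane LSH (illustrative sketch)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Hyperplanes are sampled once and do not depend on the data.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.buckets = defaultdict(set)  # hash key -> ids in that bucket
        self.vectors = {}                # id -> vector

    def _key(self, vec):
        # One sign bit per hyperplane.
        return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0.0
                     for plane in self.planes)

    def insert(self, vec_id, vec):
        if vec_id in self.vectors:  # re-insert under an existing id
            self.delete(vec_id)
        self.vectors[vec_id] = list(vec)
        self.buckets[self._key(vec)].add(vec_id)

    def delete(self, vec_id):
        key = self._key(self.vectors.pop(vec_id))
        self.buckets[key].discard(vec_id)
        if not self.buckets[key]:
            del self.buckets[key]  # reclaim the empty bucket

    def candidates(self, q):
        return set(self.buckets.get(self._key(q), set()))


lsh = HyperplaneLSH(dim=3)
lsh.insert("x", [1.0, 2.0, 3.0])
found = "x" in lsh.candidates([1.0, 2.0, 3.0])
lsh.delete("x")
gone = "x" not in lsh.candidates([1.0, 2.0, 3.0])
```

By contrast, a clustering-based index derives its partition from the data, so a shift in the data can invalidate centroids or codebooks and force retraining.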

Additional paradigms include continuous indexes (random projections; DCI) that avoid partitioning and probabilistically guarantee sublinear query/update complexity (Li et al., 2015), geometric data structures (polygonal or hyperbolic domains) (Laan et al., 12 Mar 2026, Kisfaludi-Bak et al., 2023), and dynamic range/retroactive ANN (Goodrich et al., 2011).

3. Metrics and Benchmarking

Formally, the core evaluation criteria for dynamic ANN are:

Recall@K:

$\mathrm{Recall}(K) = \frac{|\mathrm{ApproxNN}(q, K) \cap \mathrm{TrueNN}(q, K)|}{K}$

where $\mathrm{ApproxNN}(q, K)$ is the returned result and $\mathrm{TrueNN}(q, K)$ denotes ground-truth neighbors for each query $q$.
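
The recall formula translates directly into a set intersection. The function names below are hypothetical; the inputs are assumed to be lists of neighbor ids, with the approximate list coming from the index under test and the true list from an exact scan.

```python
def recall_at_k(approx_ids, true_ids, k):
    """|ApproxNN(q, K) ∩ TrueNN(q, K)| / K for a single query."""
    approx = set(approx_ids[:k])
    truth = set(true_ids[:k])
    return len(approx & truth) / k


def mean_recall_at_k(approx_lists, true_lists, k):
    """Average recall over a workload of queries."""
    scores = [recall_at_k(a, t, k) for a, t in zip(approx_lists, true_lists)]
    return sum(scores) / len(scores)


r = recall_at_k([3, 7, 9, 1], [7, 3, 2, 5], k=4)  # ids 3 and 7 overlap
```

In a dynamic setting the ground truth has to be recomputed against the index contents at query time, since earlier inserts and deletes change the true neighbor set.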

Average Query Latency:

$\bar{T}_{\mathrm{query}} = \frac{1}{|Q|} \sum_{q \in Q} T_q$

where $T_q$ is wall-clock time for query $q$ and $Q$ is the set of evaluated queries.

Average Update Latency:

$\bar{T}_{\mathrm{update}} = \frac{1}{|U|} \sum_{u \in U} T_u$

with $T_u$ the cost per insert or delete $u$ in the update stream $U$.

Combined Metrics: Product or Pareto frontiers over (recall, query latency, update latency) (Zeng et al., 2024).
Speedup Over Baseline:

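$\mathrm{Speedup} = \frac{T_{\mathrm{brute}}}{T_{\mathrm{method}}}$

with $T_{\mathrm{brute}}$ the query latency of the brute-force baseline and $T_{\mathrm{method}}$ that of the evaluated method (Harwood et al., 2024).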
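
The latency and speedup metrics can be measured with a small timing harness. This is a sketch under simple assumptions: `mean_latency` and `speedup` are hypothetical helper names, each operation is timed individually with `time.perf_counter`, and warm-up and variance handling are omitted.

```python
import time


def mean_latency(fn, inputs):
    """Average wall-clock seconds per call of fn over the given inputs."""
    total = 0.0
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        total += time.perf_counter() - start
    return total / len(inputs)


def speedup(baseline_latency, method_latency):
    """Ratio of brute-force latency to the latency of the evaluated method."""
    return baseline_latency / method_latency


# Example: time a trivial stand-in for a query function.
avg = mean_latency(lambda q: sum(x * x for x in q), [[1.0, 2.0]] * 100)
ratio = speedup(10.0, 2.0)
```

The same `mean_latency` helper applies to updates by passing the insert or delete routine as `fn`.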

Dynamic benchmarks (e.g., CANDY) incorporate synthetic drift scenarios, micro-batch variation, and pending-write latency breakdowns to reveal ingestion bottlenecks and the impact of semantic shift on recall (Zeng et al., 2024).

4. State-of-the-Art Algorithmic Techniques

Graph-based Methods

HNSW and Extensions: Hierarchical graph built incrementally with online insertion ($O(\log N)$); deletion was originally handled only logically. Recent work introduces deterministic and randomized patching (SPatch) via star-mesh transforms and sparsification, provably preserving random-walk hitting times and supporting efficient physical deletion and memory reclamation (Mishra et al., 19 Dec 2025).
Dynamic Exploration Graph (DEG): Even-regular undirected graph structure supporting continuous refinement through both incremental insertion and edge-swap-based optimization. Preserves connectivity and achieves state-of-the-art search throughput, superior to HNSW and others at extreme recall under streaming (Hezel et al., 2023).
Dynamic Adaptation to Drift: HNSW with hierarchical clustering adapts more robustly under distribution shift, maintaining recall where other methods collapse (Zeng et al., 2024).
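
The difference between logical and physical deletion in a proximity graph can be sketched on a toy example. The code below is not SPatch: it uses a naive clique patch, reconnecting all former neighbors of the removed node to each other, loosely in the spirit of a star-to-mesh replacement and without sparsification. `TinyProximityGraph` and its methods are hypothetical, and insertion finds neighbors by scan rather than by graph search.

```python
import math


class TinyProximityGraph:
    """Toy undirected proximity graph with two deletion strategies."""

    def __init__(self, degree=1):
        self.degree = degree
        self.vectors = {}        # node -> vector
        self.neighbors = {}      # node -> set of adjacent nodes
        self.tombstones = set()  # logically deleted nodes

    def insert(self, node, vec):
        # Link to the closest existing nodes (found by scan for brevity).
        nearest = sorted(self.vectors,
                         key=lambda n: math.dist(self.vectors[n], vec))[:self.degree]
        self.vectors[node] = list(vec)
        self.neighbors[node] = set(nearest)
        for n in nearest:
            self.neighbors[n].add(node)

    def tombstone(self, node):
        # Logical deletion: the node still occupies memory and is traversed.
        self.tombstones.add(node)

    def delete_and_patch(self, node):
        # Physical deletion: free the node, then connect its former
        # neighbors pairwise so that paths through it are preserved.
        nbrs = self.neighbors.pop(node)
        del self.vectors[node]
        self.tombstones.discard(node)
        for n in nbrs:
            self.neighbors[n].discard(node)
            self.neighbors[n].update(nbrs - {n})

    def search(self, q, entry):
        # Greedy walk; returns the closest live node on the path.
        dist = lambda n: math.dist(self.vectors[n], q)
        current, best_live = entry, None
        while True:
            if current not in self.tombstones and (
                    best_live is None or dist(current) < dist(best_live)):
                best_live = current
            closer = [n for n in self.neighbors[current] if dist(n) < dist(current)]
            if not closer:
                return best_live
            current = min(closer, key=dist)


g = TinyProximityGraph(degree=1)
for name, x in [("a", 0.0), ("b", 1.0), ("c", 2.0), ("d", 3.0)]:
    g.insert(name, [x])           # builds the chain a - b - c - d

g.tombstone("c")
via_tombstone = g.search([3.0], entry="a")  # walks through "c" to reach "d"
g.delete_and_patch("c")
via_patch = g.search([3.0], entry="a")      # "b" now links directly to "d"
```

The clique patch keeps the graph connected but can inflate node degree, which is why practical schemes combine patching with sparsification.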

Partition/Quantization-based Methods

ScaNN/IVFPQ: Fast under batched updates with full or partial re-clustering; optimal at moderate recall and large batch sizes. Not natively incremental: insertions are amortized over large batches or require a full rebuild (Harwood et al., 2024).
Dynamic Quantization (CoDEQ): Provably dynamically consistent product quantization under streaming updates with bounded disk I/O, extending PQ methods to dynamic settings while retaining formal $(1+\epsilon)$-ANN accuracy and achieving near-optimal recall/latency trade-offs (Aden-Ali et al., 20 Dec 2025).
ML-Driven Indexing: Replaces bucket assignment with neural (MLP) forwarding, which boosts recall in some settings at the cost of increased sensitivity to distribution shift and potentially higher per-query expense (Zeng et al., 2024).

Continuous/Projection-based Methods

Dynamic Continuous Indexing: Indexes based on random 1D projections (no partitioning); supports insertion and deletion with deterministic cost, with sublinear query time w.r.t. intrinsic dimension; empirically outperforms LSH in high dimensionality (Li et al., 2015).
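
The projection-based idea can be sketched with sorted lists of 1D projections. This is a heavily simplified illustration and not the DCI algorithm of Li et al.: it gathers a fixed window of candidates around the query's position in each list and reranks them, whereas DCI uses a prioritized retrieval scheme. The class `ProjectionIndex` and the `window` parameter are assumptions made for the example.

```python
import bisect
import math
import random


class ProjectionIndex:
    """Sorted random 1D projections with windowed candidate retrieval."""

    def __init__(self, dim, n_projections=4, seed=0):
        rng = random.Random(seed)
        self.directions = []
        for _ in range(n_projections):
            d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            norm = math.sqrt(sum(x * x for x in d))
            self.directions.append([x / norm for x in d])
        # One sorted list of (projection, id) pairs per direction.
        self.sorted_lists = [[] for _ in range(n_projections)]
        self.vectors = {}

    @staticmethod
    def _project(vec, direction):
        return sum(a * b for a, b in zip(vec, direction))

    def insert(self, vec_id, vec):
        self.vectors[vec_id] = list(vec)
        for lst, d in zip(self.sorted_lists, self.directions):
            # Binary search for the position; a Python list still shifts
            # elements, so a balanced tree would be used in practice.
            bisect.insort(lst, (self._project(vec, d), vec_id))

    def delete(self, vec_id):
        vec = self.vectors.pop(vec_id)
        for lst, d in zip(self.sorted_lists, self.directions):
            pos = bisect.bisect_left(lst, (self._project(vec, d), vec_id))
            del lst[pos]

    def query(self, q, k, window=8):
        candidates = set()
        for lst, d in zip(self.sorted_lists, self.directions):
            pos = bisect.bisect_left(lst, (self._project(q, d),))
            for _, vec_id in lst[max(0, pos - window): pos + window]:
                candidates.add(vec_id)
        # Rerank the gathered candidates by true distance.
        return sorted(candidates, key=lambda i: math.dist(self.vectors[i], q))[:k]


rng = random.Random(1)
points = {i: [rng.gauss(0.0, 1.0) for _ in range(16)] for i in range(50)}
index = ProjectionIndex(dim=16)
for pid, vec in points.items():
    index.insert(pid, vec)
hit = index.query(points[7], k=1)    # a stored point is its own nearest neighbor
index.delete(7)
after = index.query(points[7], k=3)  # the deleted id no longer appears
```

No partition is ever retrained here: updates only add or remove entries in the sorted lists, which is what makes this family of indexes naturally dynamic.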

Specialized Geometric and Retroactive Structures

Polygonal/Hyperbolic ANNs: Recent results provide logarithmic-time dynamic ANN structures for non-Euclidean domains (e.g., polygonal with obstacles, hyperbolic), via separator trees and locally sensitive hierarchical quadtrees (Laan et al., 12 Mar 2026, Kisfaludi-Bak et al., 2023).
Fully Retroactive ANN: Segmented time trees plus colored dynamic ANN, supporting retrospective queries and updates (Goodrich et al., 2011).

5. Comparative Empirical Findings

Comprehensive empirical benchmarking reveals key trade-offs:

Method (Dynamic)	Query Latency (ms)	Recall@10	Update Cost	Strengths	Weaknesses
Brute-Force Scan	0.43	1.00	O(1)	Always optimal recall; update-free	Not scalable
HNSW (graph-based)	13,466	0.61	O(log N)	Stable under drift, scales to high recall, fast incremental insert, robust to streaming	Deletion non-trivial, ingestion bottlenecks
IVF-PQ (static)	414.77	0.50	O(N/b)	Efficient batched update, tunable batch size	Requires full rebuild, not robust to drift
LSH	11.96	0.00	O(1)	Simple, very fast updates	Poor recall on real tasks
ML-LSH (ML-optimized)	5.82	0.374	O(1)	ML can boost recall in low-accuracy/low-load	Fragile to drift, higher query time
DEG	—	≥0.95	O(dh)	Connectivity, robust high-recall, steady optimization	Param tuning needed, more complex

— Table adapted from multiple experimental sections (Zeng et al., 2024, Hezel et al., 2023).

Key findings:

Ingestion cost dominates: For complex ANN structures, the bottleneck in dynamic ingestion—especially under high event rates—is almost always the time spent processing new points, not the raw search cost.
Simplicity wins at high rate: Lightweight methods (e.g., scan, LSH) outperform under extreme arrival rates, contrary to static-benchmark intuition.
Robustness to drift: Only hierarchical graph-based methods (HNSW) maintain recall under intense distribution shifts.
No universal batch size: Micro-batch tuning is algorithm-specific.
DL/Quantization optimizations offer gains, but only if coupled with ingestion-aware design.

6. Practical Guidelines and System Design Recommendations

For high-frequency, low-latency ingestion (batch size $B$ small, arrival rate $\lambda$ large): Use incremental graph-based methods (HNSW, DEG), with periodic patching or reclamation of logically deleted entries (Mishra et al., 19 Dec 2025, Hezel et al., 2023).
For moderate/low-frequency, high-throughput ingestion (batch size $B$ large): Prefer quantization-based approaches (ScaNN, IVFPQ, CoDEQ) with amortized batch rebuilds (Aden-Ali et al., 20 Dec 2025, Harwood et al., 2024).
When high absolute recall is required: HNSW and its extensions dominate, with speedups over brute-force scan reported at high recall levels (Harwood et al., 2024).
Efficient deletion: In pure read-heavy workloads, tombstoning is viable; frequent deletions demand graph patching schemes (SPatch) (Mishra et al., 19 Dec 2025).
Distributional monitoring: Monitor for concept drift and selectively retrain or adapt partitions to maintain query fidelity (Zeng et al., 2024).
Expose batch/ingest parameters and automate their tuning via lightweight online experiments.
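
The distributional-monitoring guideline can be sketched with a very simple drift signal: the distance between the centroid of a reference sample and the centroid of recent arrivals. This is a crude proxy chosen for illustration and is not taken from the cited works. `CentroidDriftMonitor` and `threshold` are hypothetical, and a real deployment would more likely track recall on probe queries or a richer distribution statistic.

```python
import math


class CentroidDriftMonitor:
    """Flags drift when the mean of recent vectors moves too far."""

    def __init__(self, reference, threshold):
        self.reference_centroid = self._centroid(reference)
        self.threshold = threshold  # maximum tolerated centroid shift

    @staticmethod
    def _centroid(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    def shift(self, recent):
        return math.dist(self.reference_centroid, self._centroid(recent))

    def drifted(self, recent):
        # True means partitions or codebooks may need retraining.
        return self.shift(recent) > self.threshold


monitor = CentroidDriftMonitor([[0.0, 0.0], [2.0, 0.0]], threshold=0.5)
stable = monitor.drifted([[1.0, 0.0], [1.0, 0.2]])   # centroid moves by 0.1
shifted = monitor.drifted([[5.0, 0.0], [7.0, 0.0]])  # centroid moves by 5.0
```

A signal of this kind can gate the expensive step, so that partitions are retrained only when the monitor fires rather than on a fixed schedule.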

7. Open Problems and Future Directions

Dynamic ANN remains an active area of research:

End-to-end dynamic quantization: Extending provable dynamic consistency and streaming update to more general quantizer classes (beyond median-split), tighter guarantees under adversarial inserts/deletes (Aden-Ali et al., 20 Dec 2025).
Joint optimization of ingestion and query layers: Achieving ML-optimized, highly adaptive but robust indexing that balances per-query and per-update latency, potentially with online learning for drift detection/adaption (Zeng et al., 2024).
Dynamic ANN in non-Euclidean and constrained spaces: Further development of efficient dynamic structures for polygonal or hyperbolic geometries, spanners, and generalized metric spaces (Kisfaludi-Bak et al., 2023, Laan et al., 12 Mar 2026).
Retrospective and temporal ANN: Full support for arbitrary time-range queries, supporting ‘back-in-time’ ANN, is enabled at scale only by fully retroactive data structures (Goodrich et al., 2011).
Holistic dynamic evaluation standards: Adoption of continuous-ingestion benchmarking (as in CANDY) that foregrounds ingestion, drift, and joint recall/latency trade-offs as first-class metrics (Zeng et al., 2024).

Dynamic ANN research highlights the distinctive computational demands of real-world, real-time vector search—effective system design requires integrating dynamic update protocols, robust search, and mechanisms for distributional adaptation within a unified, ingestion-sensitive architecture.