B+ANN Disk-Based Index Overview

Updated 26 November 2025
  • B+ANN Disk-Based Index is a disk-based nearest-neighbor retrieval architecture that merges B+ tree block partitioning with high-dimensional vector search to support scalable vector databases.
  • It employs recursive K-means++ clustering and a contiguous disk page layout to optimize spatial locality and enable efficient batched, SIMD-accelerated distance computations.
  • The design supports generalized queries, including dissimilarity retrieval, while achieving faster build times, reduced cache miss rates, and improved throughput compared to prior ANN methods.

A B+ANN disk-based index is a nearest-neighbor retrieval architecture that merges the block-partitioning and range-search capabilities of a B+ tree with high-dimensional semantic vector search requirements at billion-scale, optimized for memory-efficient, high-throughput operation both in RAM and on SSDs. This approach overcomes multiple scaling, locality, and query-type limitations of traditional in-memory graph indices such as HNSW and disk-based approaches like DiskANN, especially for modern vector database workloads (Tekin et al., 19 Nov 2025).

1. Motivation and Limitations of Prior Disk-Based ANN Indexes

Classic ANN indices, notably HNSW, employ in-memory navigable small-world graphs, which incur extensive random memory access and fine-grained pairwise distance calculations. These characteristics induce high L1/L2 cache miss rates and undermine SIMD and batched execution on CPUs and GPUs, because the computations are not vectorized. HNSW and its disk-based extension, DiskANN, support only similarity queries and exhibit random disk I/O patterns (multiple disk-page reads per edge traversal), severely limiting scaling and build throughput: billion-vector construction requires tens of hours and hundreds of gigabytes of RAM. Furthermore, because their search routines depend on monotonic shortest-path traversals in graph space, they cannot answer "most-dissimilar" vector queries and do not support efficient bulk block or edge traversal (Tekin et al., 19 Nov 2025).

2. Hierarchical Block Partitioning and Data Layout

B+ANN employs a recursive, hierarchical $K$-means++ block partitioning of the input embedding set $\mathcal{D} = \{\mathbf{x}_i\}_{i=1}^N$, with the clustering process repeated until each block contains at most $\tau$ vectors. For each block $B$, the optimization objective is:

$$\min_{C_1,\dots,C_K} \sum_{j=1}^K \sum_{\mathbf{x}\in C_j} \|\mathbf{x} - \mu_j\|_2^2, \qquad \mu_j = \frac{1}{|C_j|}\sum_{\mathbf{x}\in C_j}\mathbf{x}$$

The resulting leaf blocks $\{B_1,\dots,B_M\}$, which contain semantically similar vectors, are physically laid out so that blocks with adjacent centroids are mapped to contiguous or neighboring disk pages. This layout maximizes spatial locality for both in-memory and disk-resident traversals (Tekin et al., 19 Nov 2025).
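The recursive partitioning step can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses a plain Lloyd's $K$-means with $K$-means++ seeding, the function names are invented, and it assumes $\tau \ge K$.

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    """Minimal Lloyd's k-means with k-means++ seeding (sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    # k-means++ seeding: first centroid uniform, rest proportional to squared distance
    C = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in C], axis=0)
        C.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    C = np.stack(C)
    for _ in range(iters):
        # assign each point to its nearest centroid, then recompute centroids
        labels = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    return C, labels

def partition(X, tau, k=4):
    """Recursively split X until every block holds at most tau vectors (assumes tau >= k)."""
    if len(X) <= tau:
        return [X]
    _, labels = kmeans(X, k)
    if max(np.sum(labels == j) for j in range(k)) == len(X):
        mid = len(X) // 2  # degenerate split: fall back to halving
        return partition(X[:mid], tau, k) + partition(X[mid:], tau, k)
    blocks = []
    for j in range(k):
        part = X[labels == j]
        if len(part):
            blocks.extend(partition(part, tau, k))
    return blocks
```

In the full system, the returned blocks would then be written to disk so that blocks with adjacent centroids land on neighboring pages.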

3. B+ Tree Variant: Structural Details and Operations

The index topology is a high-fanout B+ tree variant where:

  • Inner nodes store cluster centroids as keys $\mathbf{c}^{(\ell)} \in \mathbb{R}^d$.
  • Leaf nodes store matrices of the full vectors $\mathbf{V} \in \mathbb{R}^{d \times f_\mathrm{leaf}}$, supporting up to $f_\mathrm{leaf} \approx 1024$ vectors per block and $f_\mathrm{inner} \approx 64$ child centroids per inner node.
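A minimal sketch of this node layout, assuming the fanouts above; the class names are illustrative, and the sketch stores blocks row-major as $(n, d)$ matrices rather than the paper's $d \times f_\mathrm{leaf}$ convention, which is just a transpose:

```python
from dataclasses import dataclass
import numpy as np

F_INNER, F_LEAF = 64, 1024  # approximate fanouts reported in the paper

@dataclass
class InnerNode:
    keys: np.ndarray   # (n_children, d) matrix of child-cluster centroids
    children: list     # child InnerNode / LeafNode objects, len <= F_INNER

@dataclass
class LeafNode:
    vectors: np.ndarray  # (n_vectors, d) full-precision block, n_vectors <= F_LEAF
```

Storing keys and leaf vectors as contiguous matrices is what lets each level of the descent run as one batched matrix operation.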

Insertion and node-split logic follow an extension of B+ tree algorithms: insert the centroid and block pointer, split the node using a recursive $K$-means partition when fanout exceeds the threshold, and promote block centroids to maintain balance:

function InsertBlock(node, c_block, V_block):
  node.keys ← node.keys ∪ {c_block}
  node.children ← node.children ∪ {V_block}
  if |node.keys| > f_inner:
    (left, right) ← SplitKMeans(node.children, 2)
    promote centroid(left), centroid(right) to parent

Tree traversal for search proceeds via vector–matrix dot products to find the best-matching centroid, followed by a block-level vector scan (Tekin et al., 19 Nov 2025).
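The single-path descent described above (one vector-matrix product per level, then one leaf scan) can be sketched as follows. The dict-based tree layout and the function name are illustrative assumptions, not the paper's API, and vectors are assumed unit-normalized so dot products rank by similarity:

```python
import numpy as np

def route_and_scan(tree, q, k):
    """Descend a toy B+ANN tree: at each inner node pick the child whose
    centroid has the largest dot product with q, then scan the leaf block
    with a single matmul and return its top-k vectors."""
    node = tree
    while 'children' in node:                 # inner node
        sims = node['keys'] @ q               # batched dot products to all centroids
        node = node['children'][int(np.argmax(sims))]
    d = node['vectors'] @ q                   # leaf: one matmul over the whole block
    top = np.argsort(-d)[:k]
    return node['vectors'][top]
```

This corresponds to the degenerate beam width of one; the full search procedure in Section 4 keeps several candidate nodes alive at once.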

4. Edge- and Block-Based Hybrid Traversal for Query Processing

The query procedure combines block-oriented B+ tree traversal (for high-throughput SIMD/Batched distance computations) and a lightweight skip-edge graph refinement (on leaf nodes) for candidate expansion. The algorithm is:

function Search(q, k, β, d_edge):
  PQ ← {root}
  while not PQ.empty():
    node ← PQ.pop_best()
    if node is internal:
      D ← 1 - q·C_node        # batch distances to centroids
      top ← argmin(D, β)
      PQ.push(node.children[top])
    else:
      D_leaf ← 1 - q·V_node
      candidates ← leaf.vectors[argmin(D_leaf, k)]
      GreedyGraphWalk(q, candidates, k, d_edge)
      return final k vectors

The block-level phase costs $O(\log_f N)$ contiguous block loads, allowing distance computations to run as a single batched matrix operation. The graph refinement traverses local skip edges for each candidate but is constrained to nearby blocks, keeping extra disk I/O bounded to $O(k + d_\mathrm{edge})$ hops (Tekin et al., 19 Nov 2025).
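A minimal best-first sketch of the block-level phase with beam width β follows; it omits the GreedyGraphWalk skip-edge refinement, and the dict-based tree layout and names are illustrative assumptions (unit-normalized vectors, so 1 - dot product acts as the distance):

```python
import heapq
import numpy as np

def beam_search(root, q, k, beta):
    """Best-first search over a toy tree: expand the beta closest children per
    inner node via one batched distance computation; gather leaf candidates
    until at least k are collected, then return the k closest."""
    pq = [(0.0, 0, root)]        # (distance, tiebreak, node); dicts never compared
    counter = 1
    cand_vecs, cand_dists = [], []
    while pq:
        _, _, node = heapq.heappop(pq)
        if 'children' in node:
            d = 1.0 - node['keys'] @ q          # batch distances to all centroids
            for j in np.argsort(d)[:beta]:      # keep the beta best children
                heapq.heappush(pq, (float(d[j]), counter, node['children'][int(j)]))
                counter += 1
        else:
            d_leaf = 1.0 - node['vectors'] @ q  # one matmul over the leaf block
            cand_vecs.append(node['vectors'])
            cand_dists.append(d_leaf)
            if sum(len(d) for d in cand_dists) >= k:
                break
    D = np.concatenate(cand_dists)
    V = np.vstack(cand_vecs)
    return V[np.argsort(D)[:k]]
```

Because each inner-node expansion touches one contiguous centroid matrix, the distance work per level is a single batched operation rather than many pointer-chasing scalar computations.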

5. Extended Query Semantics: Dissimilarity and Streaming Support

Unlike pure nearest-neighbor indices, B+ANN natively supports dissimilarity queries (retrieving the $k$ farthest vectors from a query $\mathbf{q}$) by navigating the B+ tree to blocks whose centroids are most distant from $\mathbf{q}$ and then scanning leaf vectors in descending order of distance. This approach sidesteps the oscillatory convergence problem in graph-based indexing, enabling a generalized query model suitable for maximally novel or diversity-oriented retrieval, which is not feasible in HNSW, DiskANN, or other mainstream ANN indices (Tekin et al., 19 Nov 2025).
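The dissimilarity path is the mirror image of the similarity descent: flip argmin to argmax at every level. A minimal sketch, again over an illustrative dict-based tree with unit-normalized vectors:

```python
import numpy as np

def k_farthest(tree, q, k):
    """Dissimilarity query sketch: descend toward the centroid *farthest*
    from q, then return the leaf's k most-distant vectors."""
    node = tree
    while 'children' in node:
        d = 1.0 - node['keys'] @ q            # batch distances to centroids
        node = node['children'][int(np.argmax(d))]  # follow the most-distant child
    d_leaf = 1.0 - node['vectors'] @ q
    return node['vectors'][np.argsort(-d_leaf)[:k]]  # descending distance
```

A graph index has no analogous move: greedy traversal only converges toward a query, which is why the paper argues farthest-point retrieval needs the tree's block structure.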

6. Performance Analysis: Recall, Throughput, and Resource Metrics

Empirical evaluation on SIFT-1B and other datasets yields the following:

| Algorithm | Build Time | Build RAM | QPS (SSD) | Recall@10 | L1 Miss Rate | Disk Build Speedup |
|---|---|---|---|---|---|---|
| HNSW (in-memory) | — | — | 200k | 95% | High | — |
| DiskANN | 48h | 500GB | 1.2M | 95% | 19.8% | — |
| B+ANN | 2h | 150GB | 1.3M | 95% | 15.9% | 24× |

Recall is computed as

$$\text{Recall@}k = \frac{1}{N}\sum_{i=1}^N \mathbf{1}(\text{trueNN}_i \subseteq \text{retrieved}_i)$$

and throughput (QPS) as $\mathrm{QPS} = \frac{\#\,\mathrm{queries}}{\mathrm{total\ latency}}$.
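Both metrics translate directly into code; a minimal sketch where queries are represented by their true and retrieved neighbor-id lists (the function names are illustrative):

```python
def recall_at_k(true_nn, retrieved):
    """Recall@k as defined above: the fraction of queries whose true
    neighbor set is contained in the retrieved set."""
    hits = sum(1 for t, r in zip(true_nn, retrieved) if set(t) <= set(r))
    return hits / len(true_nn)

def qps(num_queries, total_latency_s):
    """Throughput: queries answered per second of total latency."""
    return num_queries / total_latency_s
```

For example, if only two of three queries retrieve their true neighbor, Recall@k is 2/3 regardless of how many extra candidates were returned.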

B+ANN improves cache locality (L1 miss rate reduced by 19.2%), achieves a 24× faster disk build than DiskANN, and boosts QPS by ≈20% under matched recall (Tekin et al., 19 Nov 2025).

7. Computational Complexity, Memory Overhead, and Practical Trade-Offs

Build-time complexity is $O(N\log N)$ for tree construction and, with practical fanout parameters, $O(N\log N)$ for skip-edge graph creation. Total storage comprises the $N$ vectors, per-level centroids (≈10% overhead), and skip edges (2–4 per vertex).

Design trade-offs include:

  • Larger $f_\mathrm{leaf}$ (block size): lowers tree depth and increases scan size per I/O, favoring GPU/BLAS vectorization and improved cache efficiency.
  • Smaller $f_\mathrm{leaf}$: produces deeper trees with shorter block scans, better suited to latency-critical, memory-constrained environments.

Streaming updates are facilitated via block partitioning and local splits, while dissimilarity queries are handled natively at the leaf level (Tekin et al., 19 Nov 2025).

Competing systems such as DiskANN++ (Ni et al., 2023), BAMG (Li et al., 3 Sep 2025), and LSM-VEC (Zhong et al., 22 May 2025) propose improvements in page-based embeddings, monotonic graph pruning, and dynamic update scalability, respectively. However, B+ANN remains unique in its explicit B+ tree block organization, in its blend of block-level spatial and temporal locality for disk-based ANN, and in supporting generalized query semantics and streaming scalability at extreme data scale.

In summary, B+ANN introduces a hybrid tree-graph disk index with block-centric partitioning and traversal, achieving superior recall, throughput, build time, cache locality, and semantic query generality over previous approaches (Tekin et al., 19 Nov 2025).
