Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
157 tokens/sec
GPT-4o
43 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Parallel Batch-Dynamic $k$d-Trees (2112.06188v1)

Published 12 Dec 2021 in cs.DS and cs.DB

Abstract: $k$d-trees are widely used in parallel databases to support efficient neighborhood/similarity queries. Supporting parallel updates to $k$d-trees is therefore an important operation. In this paper, we present BDL-tree, a parallel, batch-dynamic implementation of a $k$d-tree that allows for efficient parallel $k$-NN queries over dynamically changing point sets. BDL-trees consist of a log-structured set of $k$d-trees which can be used to efficiently insert or delete batches of points in parallel with polylogarithmic depth. Specifically, given a BDL-tree with $n$ points, each batch of $B$ updates takes $O(B\log2{(n+B)})$ amortized work and $O(\log(n+B)\log\log{(n+B)})$ depth (parallel time). We provide an optimized multicore implementation of BDL-trees. Our optimizations include parallel cache-oblivious $k$d-tree construction and parallel bloom filter construction. Our experiments on a 36-core machine with two-way hyper-threading using a variety of synthetic and real-world datasets show that our implementation of BDL-tree achieves a self-relative speedup of up to $34.8\times$ ($28.4\times$ on average) for batch insertions, up to $35.5\times$ ($27.2\times$ on average) for batch deletions, and up to $46.1\times$ ($40.0\times$ on average) for $k$-nearest neighbor queries. In addition, it achieves throughputs of up to 14.5 million updates/second for batch-parallel updates and 6.7 million queries/second for $k$-NN queries. We compare to two baseline $k$d-tree implementations and demonstrate that BDL-trees achieve a good tradeoff between the two baseline options for implementing batch updates.

Citations (4)

Summary

We haven't generated a summary for this paper yet.