Accelerating Graph-based Vector Search via Delayed-Synchronization Traversal

Published 18 Jun 2024 in cs.AR | (2406.12385v1)

Abstract: Vector search systems are indispensable in LLM serving, search engines, and recommender systems, where minimizing online search latency is essential. Among various algorithms, graph-based vector search (GVS) is particularly popular due to its high search performance and quality. To efficiently serve low-latency GVS, we propose a hardware-algorithm co-design solution comprising Falcon, a GVS accelerator, and Delayed-Synchronization Traversal (DST), an accelerator-optimized graph traversal algorithm. Falcon implements high-performance GVS operators and reduces memory accesses by tracking search states in an on-chip Bloom filter. DST improves search performance and quality by relaxing the graph traversal order to maximize accelerator utilization. Evaluation across various graphs and datasets shows that our Falcon prototype on FPGAs, coupled with DST, achieves up to 4.3$\times$ and 19.5$\times$ speedups in latency and up to 8.0$\times$ and 26.9$\times$ improvements in energy efficiency over CPU-based and GPU-based GVS systems, respectively. The efficiency of Falcon and DST demonstrates their potential to become the standard solutions for future GVS acceleration.
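
To make the Bloom-filter idea concrete, here is a minimal software sketch of a conventional best-first graph search that records visited nodes in a Bloom filter instead of an exact set. It is not the Falcon design and does not implement DST, whose details the abstract does not give. The names `BloomFilter`, `graph_search`, `squared_l2`, the parameters `ef`, `num_bits`, and `num_hashes`, and the hashing scheme are all assumptions chosen for illustration.

```python
import hashlib
import heapq


class BloomFilter:
    """Fixed-size bit array with k probes; may give false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):  # sizes are illustrative assumptions
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive k bit positions from one digest via double hashing.
        digest = hashlib.blake2b(str(key).encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, key):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def graph_search(graph, vectors, query, entry, k=1, ef=8):
    """Best-first search; returns up to k (distance, node) pairs, nearest first."""
    visited = BloomFilter()
    visited.add(entry)
    d0 = squared_l2(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    results = [(-d0, entry)]     # max-heap (negated distances) of the best ef nodes
    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest unexpanded candidate is worse than the worst result.
        if len(results) >= ef and dist > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:    # a false positive here skips a node never actually visited
                continue
            visited.add(nb)
            d = squared_l2(vectors[nb], query)
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    return sorted((-nd, n) for nd, n in results)[:k]
```

The trade-off in this sketch is that the visited structure has a fixed memory footprint regardless of graph size, at the cost of occasionally skipping an unvisited node when the filter reports a false positive. Whether and how Falcon sizes or hashes its on-chip filter is not stated in the abstract.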
