Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search (2405.12207v3)

Published 20 May 2024 in cs.LG and cs.IR

Abstract: Clustering-based nearest neighbor search is an effective method in which points are partitioned into geometric shards to form an index, with only a few shards searched during query processing to find a set of top-$k$ vectors. Even though the search efficacy is heavily influenced by the algorithm that identifies the shards to probe, it has received little attention in the literature. This work bridges that gap by studying routing in clustering-based maximum inner product search. We unpack existing routers and notice the surprising contribution of optimism. We then take a page from the sequential decision making literature and formalize that insight following the principle of "optimism in the face of uncertainty." In particular, we present a framework that incorporates the moments of the distribution of inner products within each shard to estimate the maximum inner product. We then present an instance of our algorithm that uses only the first two moments to reach the same accuracy as state-of-the-art routers such as ScaNN by probing up to $50\%$ fewer points on benchmark datasets. Our algorithm is also space-efficient: we design a sketch of the second moment whose size is independent of the number of points and requires $\mathcal{O}(1)$ vectors per shard.
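To make the abstract's idea concrete, here is a minimal sketch in Python of a moment-based optimistic router. It is an illustration of the "optimism in the face of uncertainty" framing, not the authors' implementation: each shard here stores its mean vector and its full $d \times d$ second-moment matrix (the paper instead uses a compact second-moment sketch requiring only $\mathcal{O}(1)$ vectors per shard), and `alpha` is a hypothetical parameter controlling how optimistic the estimate is.

```python
import numpy as np

def build_shard_stats(shards):
    """Precompute per-shard moments. `shards` is a list of (n_i, d) arrays.

    Note: storing the full second-moment matrix S_i costs O(d^2) per shard;
    the paper replaces it with a sketch whose size is independent of n_i.
    """
    stats = []
    for X in shards:
        mu = X.mean(axis=0)          # first moment (mean vector)
        S = (X.T @ X) / len(X)       # second moment E[x x^T], uncompressed
        stats.append((mu, S))
    return stats

def optimistic_scores(q, stats, alpha=1.0):
    """Upper-estimate each shard's maximum inner product with query q.

    Over points x in shard i, <q, x> has mean <q, mu_i> and second moment
    q^T S_i q, hence variance q^T S_i q - <q, mu_i>^2. The score adds an
    optimism bonus of `alpha` standard deviations to the mean.
    """
    scores = []
    for mu, S in stats:
        mean = float(q @ mu)
        var = max(float(q @ S @ q) - mean**2, 0.0)  # clamp rounding noise
        scores.append(mean + alpha * var**0.5)
    return np.asarray(scores)

def route(q, stats, num_probes=2, alpha=1.0):
    """Return indices of the shards to probe, most promising first."""
    return np.argsort(-optimistic_scores(q, stats, alpha))[:num_probes]

# Toy usage: 8 random shards of 100 points in 16 dimensions.
rng = np.random.default_rng(0)
shards = [rng.normal(size=(100, 16)) for _ in range(8)]
q = rng.normal(size=16)
print(route(q, build_shard_stats(shards), num_probes=3))
```

Setting `alpha = 0` reduces this to ranking shards by the mean inner product, essentially the standard centroid-based router; increasing it directs probes toward shards whose inner-product distribution has high variance, which is where an unexpectedly large maximum may hide.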
