AdANNS: A Framework for Adaptive Semantic Search (2305.19435v2)

Published 30 May 2023 in cs.LG and cs.IR

Abstract: Web-scale search systems learn an encoder to embed a given query which is then hooked into an approximate nearest neighbor search (ANNS) pipeline to retrieve similar data points. To accurately capture tail queries and data points, learned representations typically are rigid, high-dimensional vectors that are generally used as-is in the entire ANNS pipeline and can lead to computationally expensive retrieval. In this paper, we argue that instead of rigid representations, different stages of ANNS can leverage adaptive representations of varying capacities to achieve significantly better accuracy-compute trade-offs, i.e., stages of ANNS that can get away with more approximate computation should use a lower-capacity representation of the same data point. To this end, we introduce AdANNS, a novel ANNS design framework that explicitly leverages the flexibility of Matryoshka Representations. We demonstrate state-of-the-art accuracy-compute trade-offs using novel AdANNS-based key ANNS building blocks like search data structures (AdANNS-IVF) and quantization (AdANNS-OPQ). For example on ImageNet retrieval, AdANNS-IVF is up to 1.5% more accurate than the rigid representations-based IVF at the same compute budget; and matches accuracy while being up to 90x faster in wall-clock time. For Natural Questions, 32-byte AdANNS-OPQ matches the accuracy of the 64-byte OPQ baseline constructed using rigid representations -- same accuracy at half the cost! We further show that the gains from AdANNS translate to modern-day composite ANNS indices that combine search structures and quantization. Finally, we demonstrate that AdANNS can enable inference-time adaptivity for compute-aware search on ANNS indices built non-adaptively on matryoshka representations. Code is open-sourced at https://github.com/RAIVNLab/AdANNS.


Summary

  • The paper introduces the AdANNS framework, which adaptively adjusts representation capacities in the ANNS pipeline to enhance accuracy-compute trade-offs.
  • It demonstrates that AdANNS-IVF can be up to 1.5% more accurate than rigid-representation-based IVF at the same compute budget, or match its accuracy while being up to 90x faster in wall-clock time.
  • Its adaptive quantization approach (AdANNS-OPQ) shows that a 32-byte representation can match the accuracy of a 64-byte OPQ baseline, delivering the same accuracy at half the cost.

An Adaptive Framework for Semantic Search

The paper presents a novel approach to improving the accuracy-compute trade-off in semantic search systems: the AdANNS framework, which leverages Matryoshka Representations (MR) for adaptive approximate nearest neighbor search (ANNS). By using adaptive embeddings, the authors move away from the traditional rigid, high-dimensional representations, which are often computationally expensive to search. The central proposition is that different stages of the ANNS pipeline can employ representations of varying capacities, matching each stage's tolerance for approximation and thereby improving the overall accuracy-compute trade-off.
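
To make the idea concrete, here is a minimal sketch (my own illustration, not the authors' code) of how a single Matryoshka embedding can serve two stages at different capacities: a cheap shortlist pass reads only a low-dimensional prefix of each stored vector, and a more expensive re-ranking pass reads a longer prefix of the same vectors. The dimensions and the helper name `mr_prefix` are illustrative assumptions.

```python
# Minimal sketch: Matryoshka Representations nest usable low-capacity embeddings
# in the prefix of one high-dimensional vector, so different ANNS stages can read
# different prefixes of the same stored data. (Illustrative; not the paper's code.)
import numpy as np

def mr_prefix(x: np.ndarray, d: int) -> np.ndarray:
    """Take the first d dimensions and re-normalize (hypothetical helper)."""
    p = x[..., :d]
    return p / (np.linalg.norm(p, axis=-1, keepdims=True) + 1e-12)

rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 2048)).astype(np.float32)    # stand-in MR embeddings
query = rng.normal(size=(2048,)).astype(np.float32)

# Cheap stage: coarse cosine scoring with a 64-d prefix.
coarse_scores = mr_prefix(db, 64) @ mr_prefix(query, 64)
shortlist = np.argsort(-coarse_scores)[:100]

# Expensive stage: re-rank only the shortlist with the full 2048-d vectors.
exact_scores = mr_prefix(db[shortlist], 2048) @ mr_prefix(query, 2048)
top10 = shortlist[np.argsort(-exact_scores)[:10]]
```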

Key Contributions and Findings

  1. AdANNS: An Adaptive ANNS Framework:
    • The authors propose AdANNS, a framework utilizing Matryoshka Representations to improve search data structures and quantization methods, achieving better accuracy-compute trade-offs than existing solutions.
  2. Advancements in ANNS Building Blocks:
    • AdANNS is instantiated in two key ANNS building blocks: the search data structure (IVF) and the quantization used for distance computation (OPQ). AdANNS-IVF improves the traditional inverted file index by building it with adaptive representations; it is up to 1.5% more accurate than rigid representation-based IVF at the same compute budget, and up to 90 times faster at matched accuracy (see the first sketch after this list).
  3. Efficient Quantization Techniques:
    • AdANNS introduces AdANNS-OPQ, which significantly outperforms baseline OPQ by quantizing adaptive, lower-dimensional embeddings. The paper demonstrates that a 32-byte AdANNS-OPQ matches the accuracy of a 64-byte OPQ baseline, i.e., the same accuracy at half the cost (a second sketch after this list illustrates the idea).
  4. Generalization to Modern Composite Indices:
    • The authors extend AdANNS to composite indices such as IVFOPQ, showing that the approach not only delivers better results but also significantly reduces computational costs. Specifically, the integration of AdANNS with DiskANN provided similar accuracy at half the cost.
  5. Empirical Validation:
    • Through extensive experimentation on datasets such as ImageNet-1K and Natural Questions, the authors demonstrate that adaptive Matryoshka Representations provide better trade-offs over rigid counterparts in both clustering and quantization tasks.
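
The sketch below, referenced from item 2, is a simplified rendering of the AdANNS-IVF idea under my own assumptions: the inverted-file clustering is learned on a low-dimensional Matryoshka prefix, while points inside the probed cells are scored with a higher-dimensional prefix of the same vectors. The specific widths (32 dimensions for clustering, 512 for scoring) and the use of scikit-learn k-means are illustrative choices, not the paper's configuration.

```python
# Simplified AdANNS-IVF-style two-stage search (illustrative, not the paper's code).
import numpy as np
from sklearn.cluster import KMeans

def build_ivf(db, d_cluster=32, n_cells=256):
    """Cluster the database on a low-dimensional prefix of the MR embeddings."""
    km = KMeans(n_clusters=n_cells, n_init=4, random_state=0).fit(db[:, :d_cluster])
    return km, km.labels_

def search(db, km, labels, query, d_cluster=32, d_score=512, n_probe=8, k=10):
    # Stage 1: pick the n_probe nearest cells using only the clustering prefix.
    cell_dist = np.linalg.norm(km.cluster_centers_ - query[:d_cluster], axis=1)
    probed = np.argsort(cell_dist)[:n_probe]
    # Stage 2: scan points in the probed cells with a richer scoring prefix.
    ids = np.nonzero(np.isin(labels, probed))[0]
    dist = np.linalg.norm(db[ids, :d_score] - query[:d_score], axis=1)
    return ids[np.argsort(dist)[:k]]

rng = np.random.default_rng(1)
db = rng.normal(size=(50_000, 1024)).astype(np.float32)
km, labels = build_ivf(db)
print(search(db, km, labels, rng.normal(size=(1024,)).astype(np.float32)))
```

Keeping the clustering prefix small makes both index construction and cell selection cheap, while the shortlist scan can afford a higher-capacity prefix because it touches far fewer points.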
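Similarly, the AdANNS-OPQ result from item 3 can be approximated with off-the-shelf tooling: instead of quantizing the full rigid vector, the OPQ codebooks are trained on a lower-dimensional Matryoshka prefix under the same or a smaller byte budget. The sketch below assumes the faiss library; the prefix width (512) and the byte budgets are illustrative, not the paper's exact settings.

```python
# Hedged sketch of the AdANNS-OPQ idea using faiss (not the authors' implementation).
import faiss
import numpy as np

rng = np.random.default_rng(0)
xb = rng.normal(size=(50_000, 2048)).astype(np.float32)   # stand-in MR embeddings
xq = rng.normal(size=(100, 2048)).astype(np.float32)

def build_opq(x, n_bytes):
    """OPQ rotation followed by PQ with n_bytes 8-bit sub-quantizers per vector."""
    index = faiss.index_factory(x.shape[1], f"OPQ{n_bytes},PQ{n_bytes}")
    index.train(x)
    index.add(x)
    return index

# Baseline: 64-byte OPQ codes on the full 2048-d rigid representation.
rigid = build_opq(xb, 64)
_, I_rigid = rigid.search(xq, 10)

# AdANNS-style: 32-byte OPQ codes on a 512-d prefix of the same vectors.
d_low = 512
adaptive = build_opq(np.ascontiguousarray(xb[:, :d_low]), 32)
_, I_adaptive = adaptive.search(np.ascontiguousarray(xq[:, :d_low]), 10)
```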

Implications and Future Directions

The paper significantly advances the ANNS field by demonstrating how learned representations can be adaptively leveraged at different stages of the pipeline to achieve optimal performance. The results suggest that adaptive representations are better aligned for search tasks than traditional rigid embeddings, offering a new avenue for developing more efficient search algorithms. Future work could explore further the integration of adaptive features in different types of ANNS indices and test scalability to even larger datasets and real-world applications.

By setting a new standard for the adoption of adaptive methods in semantic search, this paper paves the way for more efficient and cost-effective search systems in practice. The combination of theoretical insights with practical validation marks a noticeable shift towards more dynamic and flexible retrieval architectures that respond to varying resource constraints, ultimately reflecting broader trends in AI system design that prioritize adaptability and efficiency.
