AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval (2404.06004v1)
Abstract: Among approximate nearest neighbor search (ANNS) methods based on approximate proximity graphs, DiskANN achieves a good recall-speed balance for large-scale datasets by using both RAM and storage. Although it claims to save memory by loading vectors compressed with product quantization (PQ), its memory usage still grows in proportion to the scale of the dataset. In this paper, we propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads the compressed vectors to storage. Our method achieves $\sim$10 MB memory usage during query search even on billion-scale datasets, with only minor performance degradation. AiSAQ also reduces the index load time before query search, which enables switching indices among multiple billion-scale datasets and significantly enhances the flexibility of retrieval-augmented generation (RAG). The method is applicable to all graph-based ANNS algorithms and can be combined with higher-spec ANNS methods in the future.
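To make the memory claim concrete, the sketch below shows the core idea of product quantization that AiSAQ builds on: a D-dimensional float vector is split into M subvectors, each replaced by the index of its nearest centroid in a small per-subspace codebook, so each vector is stored as M bytes instead of D floats. This is a minimal illustration, not the paper's implementation; the codebook "training" here simply samples centroids from random data instead of running k-means.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, K = 8, 4, 16          # vector dims, subspaces, centroids per subspace
d = D // M                  # dimensionality of each subvector

# Toy codebooks: K centroids per subspace, sampled from random "training" data
# (a real PQ trainer would run k-means per subspace).
train = rng.normal(size=(1000, D))
codebooks = np.stack(
    [train[rng.choice(1000, K, replace=False), m * d:(m + 1) * d]
     for m in range(M)]
)                           # shape (M, K, d)

def pq_encode(x):
    """Encode x as M uint8 codes: the nearest centroid index per subspace."""
    codes = np.empty(M, dtype=np.uint8)
    for m in range(M):
        sub = x[m * d:(m + 1) * d]
        codes[m] = np.argmin(np.linalg.norm(codebooks[m] - sub, axis=1))
    return codes

def pq_decode(codes):
    """Reconstruct an approximate vector by concatenating the chosen centroids."""
    return np.concatenate([codebooks[m][codes[m]] for m in range(M)])

x = rng.normal(size=D)
codes = pq_encode(x)
x_hat = pq_decode(codes)
# The compressed form is M bytes vs. D*4 bytes for float32 -- these codes are
# what DiskANN keeps in RAM and what AiSAQ instead offloads to storage.
```

In DiskANN, these per-vector codes live in DRAM (hence memory that scales with dataset size), while AiSAQ keeps them on storage alongside the graph, reading them during traversal.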