
VDTuner: Automated Performance Tuning for Vector Data Management Systems (2404.10413v1)

Published 16 Apr 2024 in cs.DB, cs.LG, and cs.PF

Abstract: Vector data management systems (VDMSs) have become an indispensable cornerstone of large-scale information retrieval and machine learning systems such as LLMs. To enhance the efficiency and flexibility of similarity search, a VDMS exposes many tunable index and system parameters for users to specify. However, due to the inherent characteristics of VDMSs, automatic performance tuning for VDMS faces several critical challenges that existing auto-tuning methods cannot adequately address. In this paper, we introduce VDTuner, a learning-based automatic performance tuning framework for VDMS that leverages multi-objective Bayesian optimization. VDTuner overcomes the challenges associated with VDMS by efficiently exploring a complex multi-dimensional parameter space without requiring any prior knowledge. Moreover, it strikes a good balance between search speed and recall rate, delivering an optimal configuration. Extensive evaluations demonstrate that VDTuner markedly improves VDMS performance compared with the default setting (by 14.12% in search speed and 186.38% in recall rate) and is more efficient than state-of-the-art baselines (up to 3.57 times faster in tuning time). In addition, VDTuner scales to specific user preferences and cost-aware optimization objectives. VDTuner is available online at https://github.com/tiannuo-yang/VDTuner.
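The loop the abstract describes — fitting surrogate models over a multi-dimensional parameter space and trading off search speed against recall without prior knowledge — can be illustrated with a short sketch. The snippet below is not VDTuner's implementation: the two HNSW-style knobs (ef_construction, ef_search), the synthetic measure_vdms() benchmark, and the scalarized upper-confidence-bound acquisition are all illustrative assumptions; the paper's framework uses multi-objective Bayesian optimization against a real deployed system.

```python
# A minimal sketch of multi-objective Bayesian tuning in the spirit of VDTuner.
# All knobs, ranges, and the benchmark function are assumptions for illustration.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Illustrative search space: two HNSW-style knobs (assumed, not VDTuner's actual space).
bounds = np.array([[8.0, 512.0],    # ef_construction (index build quality)
                   [10.0, 400.0]])  # ef_search (query-time candidate list size)

def measure_vdms(config):
    """Synthetic stand-in for deploying a config and benchmarking the VDMS.
    Returns (queries_per_second, recall); a real tuner would run an actual workload."""
    ef_c, ef_s = config
    qps = 2000.0 / (1.0 + 0.01 * ef_s + 0.001 * ef_c) + rng.normal(0.0, 10.0)
    recall = 1.0 - np.exp(-0.02 * ef_s - 0.005 * ef_c) + rng.normal(0.0, 0.01)
    return qps, min(recall, 1.0)

def sample(n):
    return rng.uniform(bounds[:, 0], bounds[:, 1], size=(n, 2))

def norm(X):
    # Rescale configs to [0, 1] so one GP length scale suits both knobs.
    return (X - bounds[:, 0]) / (bounds[:, 1] - bounds[:, 0])

# Bootstrap with a few random configs (no prior knowledge required).
X = sample(5)
Y = np.array([measure_vdms(x) for x in X])

for _ in range(20):
    # One GP surrogate per objective: search speed and recall.
    gps = [GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
           .fit(norm(X), Y[:, j]) for j in range(2)]
    # Randomly weighted upper-confidence-bound acquisition; a fresh weight
    # vector per iteration sweeps the speed/recall trade-off.
    cand = sample(512)
    w = rng.dirichlet([1.0, 1.0])
    acq = np.zeros(len(cand))
    for j, gp in enumerate(gps):
        mu, sd = gp.predict(norm(cand), return_std=True)
        scale = Y[:, j].max() - Y[:, j].min() + 1e-9  # make objectives comparable
        acq += w[j] * (mu + 2.0 * sd) / scale
    x_next = cand[np.argmax(acq)]
    X = np.vstack([X, x_next])
    Y = np.vstack([Y, measure_vdms(x_next)])

# Keep the Pareto-optimal configs: those no other config dominates on both objectives.
pareto = [i for i in range(len(Y))
          if not np.any((Y[:, 0] >= Y[i, 0]) & (Y[:, 1] >= Y[i, 1])
                        & ((Y[:, 0] > Y[i, 0]) | (Y[:, 1] > Y[i, 1])))]
print("Pareto configs (ef_construction, ef_search) -> (QPS, recall):")
for i in pareto:
    print(f"  ({X[i, 0]:.0f}, {X[i, 1]:.0f}) -> ({Y[i, 0]:.0f}, {Y[i, 1]:.3f})")
```

Replacing measure_vdms() with a real benchmark run against a deployed system would turn the loop into a crude end-to-end tuner. The design choice mirrored from the abstract is modeling each objective with its own surrogate so the search surfaces the whole speed/recall Pareto front, from which a preferred configuration can be chosen, rather than committing to a single fixed trade-off up front.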
