Hydro: Adaptive Query Processing of ML Queries (2403.14902v1)
Abstract: Query optimization in relational database management systems (DBMSs) is critical for fast query processing. The query optimizer relies on precise selectivity and cost estimates to optimize queries effectively prior to execution. While this strategy works well for relational DBMSs, it is not sufficient for DBMSs tailored to processing ML queries. In ML-centric DBMSs, query optimization is challenging for two reasons. First, the performance bottleneck of the queries shifts to user-defined functions (UDFs) that often wrap deep learning models, making it difficult to estimate UDF statistics accurately without profiling the query. This leads to inaccurate statistics and sub-optimal query plans. Second, the optimal query plan for ML queries is data-dependent, requiring the DBMS to adapt the query plan on the fly during execution. A static query plan is therefore insufficient for such queries. In this paper, we present Hydro, an ML-centric DBMS that uses adaptive query processing (AQP) to process ML queries efficiently. Hydro is designed to evaluate UDF-based query predicates quickly by ensuring an optimal predicate evaluation order and improving the scalability of UDF execution. By integrating AQP, Hydro continuously monitors UDF statistics, routes data to predicates in an optimal order, and dynamically allocates resources for evaluating predicates. We demonstrate Hydro's efficacy through four illustrative use cases, delivering up to 11.52x speedup over a baseline system.
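The idea of monitoring UDF statistics at run time and routing data to predicates in a cost-aware order can be illustrated with a minimal Python sketch. This is not Hydro's implementation. It assumes independent predicates and uses the classic ordering heuristic of ascending cost / (1 - selectivity), with statistics gathered while the query runs. The names `PredicateStats` and `adaptive_filter`, and the injectable `clock` parameter, are hypothetical and introduced only for illustration.

```python
import math
import time
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class PredicateStats:
    """Running cost and selectivity statistics for one predicate (hypothetical helper)."""
    name: str
    fn: Callable[[Any], bool]
    evaluated: int = 0
    passed: int = 0
    total_cost: float = 0.0

    def record(self, ok: bool, elapsed: float) -> None:
        # Update the statistics after one evaluation.
        self.evaluated += 1
        self.passed += int(ok)
        self.total_cost += elapsed

    @property
    def rank(self) -> float:
        # Unprofiled predicates get rank 0 so they are tried early (exploration).
        if self.evaluated == 0:
            return 0.0
        cost = self.total_cost / self.evaluated
        drop_rate = 1.0 - self.passed / self.evaluated
        # A predicate that never drops a tuple is pushed to the end.
        return cost / drop_rate if drop_rate > 0 else math.inf


def adaptive_filter(
    rows: Iterable[Any],
    predicates: List[PredicateStats],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Any]:
    """Keep rows that satisfy every predicate, re-ordering predicates per row."""
    kept = []
    for row in rows:
        # Assumption: re-sorting per tuple is affordable because the
        # predicates (e.g. model inference) dominate the running time.
        for pred in sorted(predicates, key=lambda p: p.rank):
            start = clock()
            ok = pred.fn(row)
            pred.record(ok, clock() - start)
            if not ok:
                break  # short-circuit: skip the remaining predicates
        else:
            kept.append(row)
    return kept
```

In this sketch, a cheap and highly selective predicate quickly moves to the front of the order, so an expensive UDF is invoked only on the rows that survive it. A static plan fixed before execution could not make that adjustment if its initial cost or selectivity estimates were wrong.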