
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems (2309.12239v1)

Published 21 Sep 2023 in cs.DB

Abstract: The past decade has seen rapid growth in distributed stream data processing systems. In these systems, a stream application is realized as a Directed Acyclic Graph (DAG) of operators, where the level of parallelism of each operator has a substantial impact on overall performance. However, finding optimal levels of parallelism remains challenging. Most existing methods are tightly coupled with the topological graph of operators and cannot efficiently tune under-provisioned jobs. They either make insufficient use of previous tuning experience by treating successive tuning rounds independently, or explore the configuration space aggressively, violating Service Level Agreements (SLAs). To address these problems, we propose ContTune, a continuous tuning system for stream applications. It is equipped with a novel Big-small algorithm, in which the Big phase decouples tuning from the topological graph by decomposing the job-tuning problem into sub-problems that can be solved concurrently. In the Small phase, we propose a Conservative Bayesian Optimization (CBO) technique that speeds up the tuning process by utilizing previous observations; it leverages the state-of-the-art (SOTA) tuning method as a conservative exploration step to avoid SLA violations. Experimental results show that, compared to the SOTA method DS2, ContTune reduces the number of reconfigurations by up to 60.75% under synthetic workloads and by up to 57.5% under real workloads.
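
To make the CBO idea more concrete, the sketch below shows one way a conservative, per-operator Bayesian-optimization step could look once the Big phase has reduced the job to independent single-operator sub-problems. This is a minimal illustration under assumptions, not the authors' implementation: the Gaussian-process surrogate, the lower-confidence-bound rule, and every name in it (`gp_posterior`, `suggest_parallelism`, `history`, `target_rate`, `ds2_suggestion`) are hypothetical placeholders; the only grounding from the abstract is that previous observations are reused and that a DS2-style conservative suggestion is kept as the fallback to avoid SLA violations.

```python
"""
Illustrative sketch (not the paper's code): a conservative Bayesian-optimization
step that tunes the parallelism of one operator after the Big phase has split
the job into independent per-operator sub-problems.  All names below
(gp_posterior, suggest_parallelism, history, target_rate, ds2_suggestion)
are hypothetical placeholders introduced for illustration only.
"""
import numpy as np


def gp_posterior(xs, ys, x_query, length_scale=2.0, noise=1e-6):
    """Gaussian-process posterior mean/std with an RBF kernel (1-D inputs)."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)

    K = k(xs, xs) + noise * np.eye(len(xs))          # kernel matrix of observations
    K_star = k(x_query, xs)                          # cross-covariances to queries
    mean = K_star @ np.linalg.solve(K, ys)
    var = 1.0 - np.einsum("ij,ji->i", K_star, np.linalg.solve(K, K_star.T))
    return mean, np.sqrt(np.maximum(var, 0.0))


def suggest_parallelism(history, candidates, target_rate, ds2_suggestion, beta=2.0):
    """
    Return the smallest candidate parallelism whose pessimistic (lower-confidence-
    bound) predicted processing rate already meets the target; if no candidate is
    confidently sufficient, fall back to the DS2-style suggestion, assumed here
    to be a safe (SLA-preserving) choice.
    """
    xs = np.array([p for p, _ in history], dtype=float)
    ys = np.array([r for _, r in history], dtype=float)
    y_mean, y_std = ys.mean(), ys.std() + 1e-9                  # normalize rates
    mean, std = gp_posterior(xs, (ys - y_mean) / y_std,
                             np.array(list(candidates), dtype=float))
    lcb = mean * y_std + y_mean - beta * std * y_std            # pessimistic rate
    feasible = [p for p, l in zip(candidates, lcb) if l >= target_rate]
    return min(feasible) if feasible else ds2_suggestion


# Toy usage: previous observations of (parallelism, measured processing rate).
history = [(2, 9_000.0), (4, 19_500.0), (8, 38_000.0)]
print(suggest_parallelism(history, candidates=list(range(1, 13)),
                          target_rate=30_000.0, ds2_suggestion=7))
```

Returning the DS2-style suggestion whenever no candidate is confidently sufficient is what makes the exploration conservative in this sketch: the surrogate is only trusted when its pessimistic estimate already meets the target rate.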
