COSTREAM: Learned Cost Models for Operator Placement in Edge-Cloud Environments (2403.08444v1)
Abstract: In this work, we present COSTREAM, a novel learned cost model for Distributed Stream Processing Systems that provides accurate predictions of the execution costs of a streaming query in an edge-cloud environment. The cost model can be used to find an initial placement of operators across heterogeneous hardware, which is particularly important in these environments. In our evaluation, we demonstrate that COSTREAM can produce highly accurate cost estimates for the initial operator placement and even generalize to unseen placements, queries, and hardware. When using COSTREAM to optimize the placements of streaming operators, a median speed-up of around 21x can be achieved compared to baselines.