
Karasu: A Collaborative Approach to Efficient Cluster Configuration for Big Data Analytics (2308.11792v2)

Published 22 Aug 2023 in cs.DC and cs.LG

Abstract: Selecting the right resources for big data analytics jobs is hard because of the wide variety of configuration options like machine type and cluster size. As poor choices can have a significant impact on resource efficiency, cost, and energy usage, automated approaches are gaining popularity. Most existing methods rely on profiling recurring workloads to find near-optimal solutions over time. Due to the cold-start problem, this often leads to lengthy and costly profiling phases. However, big data analytics jobs across users can share many common properties: they often operate on similar infrastructure, using similar algorithms implemented in similar frameworks. The potential in sharing aggregated profiling runs to collaboratively address the cold start problem is largely unexplored. We present Karasu, an approach to more efficient resource configuration profiling that promotes data sharing among users working with similar infrastructures, frameworks, algorithms, or datasets. Karasu trains lightweight performance models using aggregated runtime information of collaborators and combines them into an ensemble method to exploit inherent knowledge of the configuration search space. Moreover, Karasu allows the optimization of multiple objectives simultaneously. Our evaluation is based on performance data from diverse workload executions in a public cloud environment. We show that Karasu is able to significantly boost existing methods in terms of performance, search time, and cost, even when few comparable profiling runs are available that share only partial common characteristics with the target job.
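The ensemble idea in the abstract can be illustrated with a small sketch. This is not Karasu's implementation: the abstract does not say which model type or weighting scheme is used. The sketch assumes each collaborator's shared profiling runs are `(nodes, runtime)` pairs, fits a simple `runtime ≈ a + b / nodes` model per collaborator, and weights each model by how well it ranks the few runs already observed for the target job. All function names and data values here are hypothetical.

```python
from itertools import combinations


def fit_runtime_model(runs):
    """Least-squares fit of runtime ~ a + b / nodes from (nodes, runtime) pairs."""
    xs = [1.0 / n for n, _ in runs]
    ys = [t for _, t in runs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda nodes: a + b / nodes


def ranking_weight(model, target_runs):
    """Fraction of target run pairs whose runtime order the model predicts correctly."""
    pairs = list(combinations(target_runs, 2))
    if not pairs:
        return 1.0  # no evidence yet: trust every model equally
    agree = sum(
        (model(n1) < model(n2)) == (t1 < t2)
        for (n1, t1), (n2, t2) in pairs
    )
    return agree / len(pairs)


def ensemble_predict(models, weights, nodes):
    """Weighted average of the collaborators' predictions for one cluster size."""
    total = sum(weights)
    if total == 0:
        return sum(m(nodes) for m in models) / len(models)
    return sum(w * m(nodes) for m, w in zip(models, weights)) / total


# Hypothetical shared profiling data from two collaborators.
collaborator_runs = [
    [(2, 100.0), (4, 55.0), (8, 32.0)],  # scales well with more nodes
    [(2, 40.0), (4, 45.0), (8, 60.0)],   # overhead grows with more nodes
]
# Two runs already profiled for the target job (assumed values).
target_runs = [(2, 80.0), (4, 45.0)]

models = [fit_runtime_model(runs) for runs in collaborator_runs]
weights = [ranking_weight(m, target_runs) for m in models]

# Pick the candidate cluster size with the lowest predicted runtime.
candidates = [2, 4, 8, 16]
best = min(candidates, key=lambda n: ensemble_predict(models, weights, n))
print(weights, best)
```

With this data, the first collaborator's model orders the target runs correctly and the second does not, so only the first contributes to the prediction. A real system would presumably use richer features (machine type, framework, dataset size) and, as the abstract notes, more than one objective.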
