Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency (2402.12962v1)

Published 20 Feb 2024 in cs.SE

Abstract: Autoscaling automatically scales the resources provided to applications, without human intervention, to guarantee runtime Quality of Service (QoS) while saving costs. However, user-facing cloud applications serve dynamic workloads that are often highly variable and contain bursts, which makes it hard for autoscaling to keep QoS within Service-Level Objectives (SLOs). Conservative strategies risk over-provisioning, while aggressive ones may cause SLO violations, so designing an effective autoscaler is difficult. This paper introduces BAScaler, a Burst-Aware Autoscaling framework for containerized cloud services and applications under complex workloads, which combines multi-level ML techniques to mitigate SLO violations while saving costs. BAScaler incorporates a novel prediction-based burst detection mechanism that distinguishes between predictable periodic workload spikes and actual bursts. When a burst is detected, BAScaler deliberately overestimates it and allocates resources accordingly to absorb the rapid growth in resource demand. During non-burst periods, BAScaler employs reinforcement learning to correct potential inaccuracies in resource estimation, enabling more precise resource allocation. Experiments across ten real-world workloads demonstrate BAScaler's effectiveness: it achieves a 57% average reduction in SLO violations and cuts resource costs by 10% compared to other prominent methods.
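
To make the prediction-based idea in the abstract concrete, the following is a minimal sketch and not BAScaler's actual algorithm. It assumes a forecast is already available, flags a burst when the observed workload exceeds the forecast by more than `k` standard deviations of recent forecast residuals, and then pads the demand estimate by a fixed factor. The function names, the `k` threshold, the `headroom` factor, and the per-replica capacity are all hypothetical choices made for illustration; the reinforcement learning correction described for non-burst periods is not modeled here.

```python
import math
from statistics import mean, pstdev


def detect_burst(observed, predicted, recent_residuals, k=3.0):
    """Return True when the forecast error is unusually large.

    A predictable periodic spike is already reflected in `predicted`,
    so its residual stays small. Only a residual far above the recent
    residual distribution is treated as a burst (assumed rule).
    """
    if len(recent_residuals) < 2:
        return False  # not enough history to judge
    residual = observed - predicted
    mu = mean(recent_residuals)
    sigma = max(pstdev(recent_residuals), 1e-9)  # avoid a zero threshold
    return residual > mu + k * sigma


def replicas_needed(observed, predicted, is_burst,
                    per_replica_capacity, headroom=1.5):
    """Translate estimated demand into a replica count.

    During a burst the demand estimate is deliberately inflated by
    `headroom` (a hypothetical fixed factor) to absorb further growth.
    """
    demand = max(observed, predicted)
    if is_burst:
        demand *= headroom
    return max(1, math.ceil(demand / per_replica_capacity))


# Recent forecast errors (observed minus predicted), in requests/sec.
residuals = [-3, 2, 4, -1, 0]

# Periodic spike: high load, but the forecast anticipated it.
periodic = detect_burst(observed=205, predicted=200, recent_residuals=residuals)

# Actual burst: load far above what the forecast expected.
burst = detect_burst(observed=180, predicted=100, recent_residuals=residuals)

periodic_replicas = replicas_needed(205, 200, periodic, per_replica_capacity=50)
burst_replicas = replicas_needed(180, 100, burst, per_replica_capacity=50)
```

In this sketch the periodic spike is served with the replica count implied by the forecast, while the burst case is intentionally over-provisioned, mirroring the trade-off between SLO violations and cost that the abstract describes.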
