
Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices (2212.12180v5)

Published 23 Dec 2022 in cs.DC and cs.LG

Abstract: Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers are faced with two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between the two levels, however, is challenging because user requests traverse heterogeneous services that collectively (but unevenly) contribute to the end-to-end latency. We present Autothrottle, a bi-level resource management framework for microservices with latency SLOs (service-level objectives). It architecturally decouples application SLO feedback from service resource control, and bridges them through the notion of performance targets. Specifically, an application-wide learning-based controller is employed to periodically set performance targets -- expressed as CPU throttle ratios -- for per-service heuristic controllers to attain. We evaluate Autothrottle on three microservice applications, with workload traces from production scenarios. Results show superior CPU savings, up to 26.21% over the best-performing baseline and up to 93.84% over all baselines.
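To make the idea of a CPU throttle ratio as a per-service performance target more concrete, here is a minimal sketch. It is not the paper's controller. It assumes the throttle ratio is the fraction of CFS scheduling periods in which a container was throttled, computed from the `nr_periods` and `nr_throttled` counters that Linux cgroups expose in `cpu.stat`. The names `CpuStat`, `throttle_ratio`, and `adjust_quota`, and the fixed multiplicative step, are hypothetical choices for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CpuStat:
    """Snapshot of two counters from a cgroup's cpu.stat file."""
    nr_periods: int    # CFS enforcement periods elapsed so far
    nr_throttled: int  # periods in which the cgroup hit its quota


def throttle_ratio(prev: CpuStat, curr: CpuStat) -> float:
    """Fraction of periods throttled between two snapshots."""
    periods = curr.nr_periods - prev.nr_periods
    if periods <= 0:
        return 0.0  # no new periods observed; treat as not throttled
    return (curr.nr_throttled - prev.nr_throttled) / periods


def adjust_quota(quota_cores: float, observed: float, target: float,
                 step: float = 0.1, min_cores: float = 0.1,
                 max_cores: float = 8.0) -> float:
    """Hypothetical heuristic: nudge the CPU quota toward the target ratio.

    More throttling than the target means the service is starved, so grow
    the quota; less throttling means it is over-provisioned, so shrink it.
    """
    if observed > target:
        quota_cores *= 1 + step
    elif observed < target:
        quota_cores *= 1 - step
    return min(max(quota_cores, min_cores), max_cores)


if __name__ == "__main__":
    ratio = throttle_ratio(CpuStat(100, 10), CpuStat(200, 40))
    print(ratio, adjust_quota(1.0, ratio, target=0.1))
```

In a bi-level arrangement like the one the abstract describes, an application-wide controller would periodically choose `target` from end-to-end latency feedback, while a loop like `adjust_quota` would run locally for each service.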
