Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices (2212.12180v5)
Abstract: Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers are faced with two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between the two levels, however, is challenging because user requests traverse heterogeneous services that collectively (but unevenly) contribute to the end-to-end latency. We present Autothrottle, a bi-level resource management framework for microservices with latency SLOs (service-level objectives). It architecturally decouples application SLO feedback from service resource control, and bridges them through the notion of performance targets. Specifically, an application-wide learning-based controller is employed to periodically set performance targets -- expressed as CPU throttle ratios -- for per-service heuristic controllers to attain. We evaluate Autothrottle on three microservice applications, with workload traces from production scenarios. Results show superior CPU savings, up to 26.21% over the best-performing baseline and up to 93.84% over all baselines.
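The lower level of the bi-level design can be illustrated with a minimal sketch of a per-service feedback step that steers a CPU quota toward a given throttle-ratio target. This is an illustration under stated assumptions, not the paper's controller: the throttle ratio is computed here from the `nr_periods` and `nr_throttled` counters that Linux CFS bandwidth control exposes in a cgroup's `cpu.stat`, while the names `CpuStat`, `throttle_ratio`, and `adjust_quota`, the multiplicative step, the dead band, and the core limits are all hypothetical. The application-wide learning-based controller that would choose `target` is outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class CpuStat:
    """Snapshot of two cumulative counters from a cgroup's cpu.stat."""
    nr_periods: int    # CFS enforcement periods elapsed so far
    nr_throttled: int  # periods in which the cgroup was throttled


def throttle_ratio(prev: CpuStat, curr: CpuStat) -> float:
    """Fraction of periods throttled between two snapshots."""
    periods = curr.nr_periods - prev.nr_periods
    if periods <= 0:
        return 0.0
    return (curr.nr_throttled - prev.nr_throttled) / periods


def adjust_quota(quota_cores: float, observed: float, target: float,
                 step: float = 0.1, band: float = 0.02,
                 min_cores: float = 0.1, max_cores: float = 8.0) -> float:
    """One feedback step (assumed rule, not taken from the paper).

    Grow the quota when the service is throttled more than the target,
    shrink it when throttled clearly less, and hold inside the dead band.
    """
    if observed > target + band:
        quota_cores *= 1.0 + step
    elif observed < target - band:
        quota_cores *= 1.0 - step
    return min(max_cores, max(min_cores, quota_cores))


if __name__ == "__main__":
    prev, curr = CpuStat(100, 10), CpuStat(200, 40)
    observed = throttle_ratio(prev, curr)
    print(observed, adjust_quota(2.0, observed, target=0.10))
```

In this sketch a higher target tolerates more throttling and therefore allows a smaller quota, which is the sense in which a throttle ratio can act as a performance target that trades CPU for latency.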