Cloud-Native Computing: A Survey from the Perspective of Services (2306.14402v1)
Abstract: The development of cloud computing delivery models has inspired the emergence of cloud-native computing. As the most influential development principle for web applications, cloud-native computing has attracted increasing attention in both industry and academia. Despite the momentum in the cloud-native industrial community, a clear research roadmap on this topic is still missing. To help fill this gap, this paper surveys key issues across the life-cycle of cloud-native applications from the perspective of services. Specifically, we elaborate the research domains by decoupling the life-cycle of cloud-native applications into four states: building, orchestration, operation, and maintenance. We also discuss the fundamental necessities and summarize the key performance metrics that play critical roles during the development and management of cloud-native applications. We highlight the key implications and limitations of existing works in each state. The challenges, future directions, and research opportunities are also discussed.
- M. P. Papazoglou, “Service-oriented computing: Concepts, characteristics and directions,” in Proceedings of the Fourth International Conference on Web Information Systems Engineering, 2003. WISE 2003. IEEE, 2003, pp. 3–12.
- M. P. Papazoglou, P. Traverso, S. Dustdar, and F. Leymann, “Service-oriented computing: a research roadmap,” International Journal of Cooperative Information Systems, vol. 17, no. 02, pp. 223–255, 2008.
- N. Alshuqayran, N. Ali, and R. Evans, “A systematic mapping study in microservice architecture,” in 2016 IEEE 9th International Conference on Service-Oriented Computing and Applications (SOCA). IEEE, 2016, pp. 44–51.
- L. Leite, C. Rocha, F. Kon, D. Milojicic, and P. Meirelles, “A survey of devops concepts and challenges,” ACM Computing Surveys (CSUR), vol. 52, no. 6, pp. 1–35, 2019.
- F. Zampetti, S. Geremia, G. Bavota, and M. Di Penta, “CI/CD pipelines evolution and restructuring: A qualitative and quantitative study,” in 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2021, pp. 471–482.
- S. Barlev, Z. Basil, S. Kohanim, R. Peleg, S. Regev, and A. Shulman-Peleg, “Secure yet usable: Protecting servers and linux containers,” IBM Journal of Research and Development, vol. 60, no. 4, pp. 12–1, 2016.
- M. Mattetti, A. Shulman-Peleg, Y. Allouche, A. Corradi, S. Dolev, and L. Foschini, “Securing the infrastructure and the workloads of linux containers,” in 2015 IEEE Conference on Communications and Network Security (CNS). IEEE, 2015, pp. 559–567.
- “Docker: Modernize your applications, accelerate innovation,” [n.d.]. [Online]. Available: https://www.docker.com/
- “Kubernetes: Production-grade container orchestration.” [Online]. Available: https://kubernetes.io/
- A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes, “Large-scale cluster management at google with borg,” in Proceedings of the Tenth European Conference on Computer Systems, 2015, pp. 1–17.
- N. Kratzke and P.-C. Quint, “Understanding cloud-native applications after 10 years of cloud computing-a systematic mapping study,” Journal of Systems and Software, vol. 126, pp. 1–16, 2017.
- N. Kratzke, “A brief history of cloud application architectures,” Applied Sciences, vol. 8, no. 8, p. 1368, 2018.
- G. Gil, D. Corujo, and P. Pedreiras, “Cloud native computing for industry 4.0: Challenges and opportunities,” in 2021 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA). IEEE, 2021, pp. 01–04.
- Q. Duan, “Intelligent and autonomous management in cloud-native future networks—a survey on related standards from an architectural perspective,” Future Internet, vol. 13, no. 2, p. 42, 2021.
- A. Senthuran and S. Hettiarachchi, “A review of dynamic scalability and dynamic scheduling in cloud-native distributed stream processing systems,” ICDSMLA 2019, pp. 1539–1553, 2020.
- C. Carrión, “Kubernetes scheduling: Taxonomy, ongoing issues and challenges,” ACM Computing Surveys (CSUR), 2022.
- B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica, “Mesos: A platform for Fine-Grained resource sharing in the data center,” in 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11). Boston, MA: USENIX Association, Mar. 2011. [Online]. Available: https://www.usenix.org/conference/nsdi11/mesos-platform-fine-grained-resource-sharing-data-center
- N. Naik, “Building a virtual system of systems using docker swarm in multiple clouds,” in 2016 IEEE International Symposium on Systems Engineering (ISSE). IEEE, 2016, pp. 1–3.
- V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth et al., “Apache hadoop yarn: Yet another resource negotiator,” in Proceedings of the 4th annual Symposium on Cloud Computing, 2013, pp. 1–16.
- W. Li, Y. Lemieux, J. Gao, Z. Zhao, and Y. Han, “Service mesh: Challenges, state of the art, and future research opportunities,” in 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE). IEEE, 2019, pp. 122–1225.
- F. Liu, G. Tang, Y. Li, Z. Cai, X. Zhang, and T. Zhou, “A survey on edge computing systems and tools,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1537–1562, 2019.
- W. Xu, “Test report on kubeedge’s support for 100,000 edge nodes,” https://kubeedge.io/en/blog/scalability-test-report/, 2022.
- “Kernel-based virtual machine.” [Online]. Available: https://www.redhat.com/en/topics/virtualization/what-is-KVM
- R. Morabito, J. Kjällman, and M. Komu, “Hypervisors vs. lightweight virtualization: a performance comparison,” in 2015 IEEE International Conference on cloud engineering. IEEE, 2015, pp. 386–393.
- K. Kushwaha and N. Center, “How container runtimes matter in kubernetes?” 2017.
- D. B. Rawat and S. R. Reddy, “Software defined networking architecture, security and energy efficiency: A survey,” IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 325–346, 2016.
- J. Deng, H. Hu, H. Li, Z. Pan, K.-C. Wang, G.-J. Ahn, J. Bi, and Y. Park, “Vnguard: An nfv/sdn combination framework for provisioning and managing virtual firewalls,” in 2015 IEEE Conference on Network Function Virtualization and Software Defined Network (NFV-SDN). IEEE, 2015, pp. 107–114.
- M. A. Harrabi, M. Jeridi, N. Amri, M. R. Jerbi, A. Jhine, and H. Khamassi, “Implementing nfv routers and sdn controllers in mpls architecture,” in 2015 World Congress on Information Technology and Computer Applications (WCITCA). IEEE, 2015, pp. 1–6.
- M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar, and C. W. Mike Bursell, “Virtual extensible local area network (vxlan): A framework for overlaying virtualized layer 2 networks over layer 3 networks,” https://datatracker.ietf.org/doc/rfc7348/, 2020.
- Y. Park, H. Yang, and Y. Kim, “Performance analysis of cni (container networking interface) based container network,” in 2018 International Conference on Information and Communication Technology Convergence (ICTC), 2018, pp. 248–250.
- S. Qi, S. G. Kulkarni, and K. K. Ramakrishnan, “Assessing container network interface plugins: Functionality, performance, and scalability,” IEEE Transactions on Network and Service Management, vol. 18, no. 1, pp. 656–671, 2021.
- R. Kumar and M. C. Trivedi, “Networking analysis and performance comparison of kubernetes cni plugins,” in Advances in Computer, Communication and Computational Sciences. Springer, 2021, pp. 99–109.
- L. Leite, C. Rocha, F. Kon, D. Milojicic, and P. Meirelles, “A survey of devops concepts and challenges,” ACM Comput. Surv., vol. 52, no. 6, nov 2019. [Online]. Available: https://doi.org/10.1145/3359981
- T. K. Community, “Kubesphere devops: A powerful ci/cd platform built on top of kubernetes for devops-oriented teams,” https://kubesphere.io/devops/, 2022.
- H. Chen, S. Deng, H. Zhu, H. Zhao, R. Jiang, S. Dustdar, and A. Y. Zomaya, “Mobility-Aware Offloading and Resource Allocation for Distributed Services Collaboration,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 10, pp. 2428–2443, 2022.
- C. Zheng, Q. Zhuang, and F. Guo, “A Multi-Tenant Framework for Cloud Container Services,” in 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS). IEEE, 2021, pp. 359–369.
- Ł. Wojciechowski, K. Opasiak, J. Latusek, M. Wereski, V. Morales, T. Kim, and M. Hong, “NetMARKS: Network metrics-AwaRe kubernetes scheduler powered by service mesh,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–9.
- Y. Fu, S. Zhang, J. Terrero, Y. Mao, G. Liu, S. Li, and D. Tao, “Progress-based container scheduling for short-lived applications in a kubernetes cluster,” in 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2019, pp. 278–287.
- S. Deng, H. Zhao, Z. Xiang, C. Zhang, R. Jiang, Y. Li, J. Yin, S. Dustdar, and A. Y. Zomaya, “Dependent Function Embedding for Distributed Serverless Edge Computing,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 10, pp. 2346–2357, 2021.
- M. C. Ogbuachi, C. Gore, A. Reale, P. Suskovics, and B. Kovács, “Context-aware K8S scheduler for real time distributed 5G edge computing applications,” in 2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM). IEEE, 2019, pp. 1–6.
- G. Zhang, R. Lu, and W. Wu, “Multi-resource fair allocation for cloud federation,” in 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). IEEE, 2019, pp. 2189–2194.
- G. Yeung, D. Borowiec, R. Yang, A. Friday, R. Harper, and P. Garraghan, “Horus: Interference-aware and prediction-based scheduling in deep learning systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 1, pp. 88–100, 2021.
- Y. Han, S. Shen, X. Wang, S. Wang, and V. C. M. Leung, “Tailored learning-based scheduling for kubernetes-oriented edge-cloud system,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–10.
- T. Pusztai, A. Morichetta, V. C. Pujol, S. Dustdar, S. Nastic, X. Ding, D. Vij, and Y. Xiong, “Slo script: A novel language for implementing complex cloud-native elasticity-driven slos,” in 2021 IEEE International Conference on Web Services (ICWS). IEEE, 2021, pp. 21–31.
- ——, “A novel middleware for efficiently implementing complex cloud-native slos,” in 2021 IEEE 14th International Conference on Cloud Computing (CLOUD), 2021, pp. 410–420.
- M. Imdoukh, I. Ahmad, and M. G. Alfailakawi, “Machine learning-based auto-scaling for containerized applications,” Neural Computing and Applications, vol. 32, no. 13, pp. 9745–9760, 2020.
- R. Pinciroli, A. Ali, F. Yan, and E. Smirni, “Cedule+: Resource management for burstable cloud instances using predictive analytics,” IEEE Transactions on Network and Service Management, vol. 18, no. 1, pp. 945–957, 2020.
- S. N. A. Jawaddi, M. H. Johari, and A. Ismail, “A review of microservices autoscaling with formal verification perspective,” Software: Practice and Experience, 2022.
- Z. Wang, X. Tang, Q. Liu, and J. Han, “Jily: Cost-aware AutoScaling of heterogeneous GPU for DNN inference in public cloud,” in 2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC). IEEE, 2019, pp. 1–8.
- J. P. K. S. Nunes, T. Bianchi, A. Y. Iwasaki, and E. Y. Nakagawa, “State of the art on microservices autoscaling: An overview,” Anais do XLVIII Seminário Integrado de Software e Hardware, pp. 30–38, 2021.
- T.-T. Nguyen, Y.-J. Yeom, T. Kim, D.-H. Park, and S. Kim, “Horizontal pod autoscaling in Kubernetes for elastic container orchestration,” Sensors, vol. 20, no. 16, p. 4621, 2020.
- J. Liu, S. Zhang, Q. Wang, and J. Wei, “Coordinating Fast Concurrency Adapting with Autoscaling for SLO-Oriented Web Applications,” IEEE Transactions on Parallel and Distributed Systems, 2022.
- F. Rossi, M. Nardelli, and V. Cardellini, “Horizontal and vertical scaling of container-based applications using reinforcement learning,” in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD). IEEE, 2019, pp. 329–338.
- F. Rossi, V. Cardellini, F. L. Presti, and M. Nardelli, “Geo-distributed efficient deployment of containers with Kubernetes,” Computer Communications, vol. 159, pp. 161–174, 2020.
- F. Ebadifard, S. M. Babamir, and S. Barani, “A dynamic task scheduling algorithm improved by load balancing in cloud computing,” in 2020 6th International Conference on Web Research (ICWR). IEEE, 2020, pp. 177–183.
- A. De Santo, A. Galli, M. Gravina, V. Moscato, and G. Sperlì, “Deep learning for hdd health assessment: An application based on lstm,” IEEE Transactions on Computers, vol. 71, no. 1, pp. 69–80, 2020.
- K. Nguyen, S. Drew, C. Huang, and J. Zhou, “Collaborative container-based parked vehicle edge computing framework for online task offloading,” in 2020 IEEE 9th International Conference on Cloud Networking (CloudNet). IEEE, 2020, pp. 1–6.
- Y. Bao, Y. Peng, and C. Wu, “Deep learning-based job placement in distributed machine learning clusters,” in IEEE INFOCOM 2019-IEEE conference on computer communications. IEEE, 2019, pp. 505–513.
- A. Beltre, P. Saha, and M. Govindaraju, “Kubesphere: An approach to multi-tenant fair scheduling for kubernetes clusters,” in 2019 IEEE Cloud Summit. IEEE, 2019, pp. 14–20.
- M. F. Bestari, A. I. Kistijantoro, and A. B. Sasmita, “Dynamic Resource Scheduler for Distributed Deep Learning Training in Kubernetes,” in 2020 7th International Conference on Advance Informatics: Concepts, Theory and Applications (ICAICTA). IEEE, 2020, pp. 1–6.
- M. Carvalho and D. F. Macedo, “QoE-Aware Container Scheduler for Co-located Cloud Environments,” in 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM). IEEE, 2021, pp. 286–294.
- L. Toka, “Ultra-reliable and low-latency computing in the edge with kubernetes,” Journal of Grid Computing, vol. 19, no. 3, pp. 1–23, 2021.
- A. Warke, M. Mohamed, R. Engel, H. Ludwig, W. Sawdon, and L. Liu, “Storage Service Orchestration with Container Elasticity,” in 2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC). IEEE, 2018, pp. 283–292.
- D. Santoro, D. Zozin, D. Pizzolli, F. De Pellegrini, and S. Cretti, “Foggy: a platform for workload orchestration in a fog computing environment,” in 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom). IEEE, 2017, pp. 231–234.
- A. C. Caminero and R. Muñoz-Mansilla, “Quality of service provision in fog computing: Network-aware scheduling of containers,” Sensors, vol. 21, no. 12, p. 3978, 2021.
- N. D. Nguyen, L.-A. Phan, D.-H. Park, S. Kim, and T. Kim, “Elasticfog: Elastic resource provisioning in container-based fog computing,” IEEE Access, vol. 8, pp. 183 879–183 890, 2020.
- J. Santos, T. Wauters, B. Volckaert, and F. De Turck, “Towards network-aware resource provisioning in kubernetes for fog computing applications,” in 2019 IEEE Conference on Network Softwarization (NetSoft). IEEE, 2019, pp. 351–359.
- J. Santos, T. Wauters, B. Volckaert, and F. De Turck, “Resource provisioning in fog computing: From theory to practice,” Sensors, vol. 19, no. 10, p. 2238, 2019.
- K. Kaur, S. Garg, G. Kaddoum, S. H. Ahmed, and M. Atiquzzaman, “KEIDS: Kubernetes-based energy and interference driven scheduler for industrial IoT in edge-cloud ecosystem,” IEEE Internet of Things Journal, vol. 7, no. 5, pp. 4228–4237, 2019.
- Z. Zhong, J. He, M. A. Rodriguez, S. Erfani, R. Kotagiri, and R. Buyya, “Heterogeneous task co-location in containerized cloud computing environments,” in 2020 IEEE 23rd International Symposium on Real-Time Distributed Computing (ISORC). IEEE, 2020, pp. 79–88.
- Y.-G. Yim, H.-J. Jang, and H.-W. Jin, “QoS for best-effort batch jobs in container-based cloud,” Concurrency and Computation: Practice and Experience, 2021.
- P. Chhikara, R. Tekchandani, N. Kumar, and M. S. Obaidat, “An efficient container management scheme for resource-constrained intelligent IoT devices,” IEEE Internet of Things Journal, vol. 8, no. 16, pp. 12 597–12 609, 2020.
- M. Xu and R. Buyya, “Brownout approach for adaptive management of resources and applications in cloud computing systems: A taxonomy and future directions,” ACM Computing Surveys (CSUR), vol. 52, no. 1, pp. 1–27, 2019.
- J. R. Gunasekaran, P. Thinakaran, N. C. Nachiappan, M. T. Kandemir, and C. R. Das, “Fifer: Tackling resource underutilization in the serverless era,” in Proceedings of the 21st International Middleware Conference, 2020, pp. 280–295.
- P. Thinakaran, J. R. Gunasekaran, B. Sharma, M. T. Kandemir, and C. R. Das, “Kube-knots: Resource harvesting through dynamic container orchestration in gpu-based datacenters,” in 2019 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2019, pp. 1–13.
- A. Chung, J. W. Park, and G. R. Ganger, “Stratus: Cost-aware container scheduling in the public cloud,” in Proceedings of the ACM symposium on cloud computing, 2018, pp. 121–134.
- Z. Zhong and R. Buyya, “A cost-efficient container orchestration strategy in kubernetes-based cloud computing infrastructures with heterogeneous resources,” ACM Transactions on Internet Technology (TOIT), vol. 20, no. 2, pp. 1–24, 2020.
- Y. Xu, J. Yao, H.-A. Jacobsen, and H. Guan, “Cost-efficient negotiation over multiple resources with reinforcement learning,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS). IEEE, 2017, pp. 1–6.
- M. Xu, A. N. Toosi, and R. Buyya, “A self-adaptive approach for managing applications and harnessing renewable energy for sustainable cloud computing,” IEEE Transactions on Sustainable Computing, vol. 6, no. 4, pp. 544–558, 2020.
- I. Kim, K. J. Oh, and Y. I. Eom, “Overlit: New Storage Driver for Localization and Specialization,” in 2019 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2019, pp. 1–4.
- K. Gos and W. Zabierowski, “The comparison of microservice and monolithic architecture,” in 2020 IEEE XVIth International Conference on the Perspective Technologies and Methods in MEMS Design (MEMSTECH). IEEE, 2020, pp. 150–153.
- L. De Lauretis, “From monolithic architecture to microservices architecture,” in 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). IEEE, 2019, pp. 93–96.
- F. Ponce, G. Márquez, and H. Astudillo, “Migrating from monolithic architecture to microservices: A rapid review,” in 2019 38th International Conference of the Chilean Computer Science Society (SCCC). IEEE, 2019, pp. 1–7.
- E. Jonas, J. Schleier-Smith, V. Sreekanti, C.-C. Tsai, A. Khandelwal, Q. Pu, V. Shankar, J. Carreira, K. Krauth, N. Yadwadkar et al., “Cloud programming simplified: A berkeley view on serverless computing,” arXiv preprint arXiv:1902.03383, 2019.
- N. Wang, R. Zhou, L. Jiao, R. Zhang, B. Li, and Z. Li, “Preemptive Scheduling for Distributed Machine Learning Jobs in Edge-Cloud Networks,” IEEE Journal on Selected Areas in Communications, 2022.
- A. Xu, Z. Huo, and H. Huang, “On the acceleration of deep learning model parallelism with staleness,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2088–2097.
- Y. Huang, Y. Cheng, A. Bapna, O. Firat, D. Chen, M. Chen, H. Lee, J. Ngiam, Q. V. Le, Y. Wu et al., “Gpipe: Efficient training of giant neural networks using pipeline parallelism,” Advances in neural information processing systems, vol. 32, 2019.
- H. Wang, Z. Liu, and H. Shen, “Job scheduling for large-scale machine learning clusters,” in Proceedings of the 16th International Conference on emerging Networking EXperiments and Technologies, 2020, pp. 108–120.
- H. Albahar, S. Dongare, Y. Du, N. Zhao, A. K. Paul, and A. R. Butt, “SCHEDTUNE: A Heterogeneity-Aware GPU Scheduler for Deep Learning,” in 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2022, pp. 695–705.
- X. Wu, H. Xu, and Y. Wang, “Irina: Accelerating dnn inference with efficient online scheduling,” in 4th Asia-Pacific Workshop on Networking, 2020, pp. 36–43.
- H. Shen, L. Chen, Y. Jin, L. Zhao, B. Kong, M. Philipose, A. Krishnamurthy, and R. Sundaram, “Nexus: A gpu cluster engine for accelerating dnn-based video analysis,” in Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019, pp. 322–337.
- N. Zhou, Y. Georgiou, M. Pospieszny, L. Zhong, H. Zhou, C. Niethammer, B. Pejak, O. Marko, and D. Hoppe, “Container orchestration on hpc systems through kubernetes,” Journal of Cloud Computing, vol. 10, no. 1, pp. 1–14, 2021.
- C. Misale, M. Drocco, D. J. Milroy, C. E. A. Gutierrez, S. Herbein, D. H. Ahn, and Y. Park, “It’s a Scheduling Affair: GROMACS in the Cloud with the KubeFlux Scheduler,” in 2021 3rd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC). IEEE, 2021, pp. 10–16.
- C. Misale, D. J. Milroy, C. E. A. Gutierrez, M. Drocco, S. Herbein, D. H. Ahn, Z. Kaiser, and Y. Park, “Towards standard Kubernetes scheduling interfaces for converged computing,” in Smoky Mountains Computational Sciences and Engineering Conference. Springer, 2021, pp. 310–326.
- A. M. Beltre, P. Saha, M. Govindaraju, A. Younge, and R. E. Grant, “Enabling hpc workloads on cloud infrastructure using kubernetes container orchestration mechanisms,” in 2019 IEEE/ACM International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC). IEEE, 2019, pp. 11–20.
- S. López-Huguet, J. D. Segrelles, M. Kasztelnik, M. Bubak, and I. Blanquer, “Seamlessly managing HPC workloads through Kubernetes,” in International Conference on High Performance Computing. Springer, 2020, pp. 310–320.
- S. Choochotkaew, T. Chiba, S. Trent, T. Yoshimura, and M. Amaral, “AutoDECK: Automated Declarative Performance Evaluation and Tuning Framework on Kubernetes,” in 2022 IEEE 15th International Conference on Cloud Computing (CLOUD). IEEE, 2022, pp. 309–314.
- D. Fan and D. He, “Knative Autoscaler Optimize Based on Double Exponential Smoothing,” in 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC). IEEE, 2020, pp. 614–617.
- W. Ling, L. Ma, C. Tian, and Z. Hu, “Pigeon: A dynamic and efficient serverless and faas framework for private cloud,” in 2019 International Conference on Computational Science and Computational Intelligence (CSCI). IEEE, 2019, pp. 1416–1421.
- K. Kaffes, N. J. Yadwadkar, and C. Kozyrakis, “Centralized core-granular scheduling for serverless functions,” in Proceedings of the ACM symposium on cloud computing, 2019, pp. 158–164.
- S. Venkataraman, A. Panda, G. Ananthanarayanan, M. J. Franklin, and I. Stoica, “The power of choice in {{\{{Data-Aware}}\}} cluster scheduling,” in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), 2014, pp. 301–316.
- S. Venkataraman, A. Panda, K. Ousterhout, M. Armbrust, A. Ghodsi, M. J. Franklin, B. Recht, and I. Stoica, “Drizzle: Fast and adaptable stream processing at scale,” in Proceedings of the 26th Symposium on Operating Systems Principles, 2017, pp. 374–389.
- H. Wang, D. Niu, and B. Li, “Distributed machine learning with a serverless architecture,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 1288–1296.
- R. Gu, K. Zhang, Z. Xu, Y. Che, B. Fan, H. Hou, H. Dai, L. Yi, Y. Ding, G. Chen, and Others, “Fluid: Dataset abstraction and elastic acceleration for cloud-native deep learning training jobs,” in 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022, pp. 2182–2195.
- X. Zhang, L. Li, Y. Wang, E. Chen, and L. Shou, “Zeus: Improving resource efficiency via workload colocation for massive kubernetes clusters,” IEEE Access, vol. 9, pp. 105 192–105 204, 2021.
- X. Chen, L. Cheng, C. Liu, Q. Liu, J. Liu, Y. Mao, and J. Murphy, “A WOA-based optimization approach for task scheduling in cloud computing systems,” IEEE Systems Journal, vol. 14, no. 3, pp. 3117–3128, 2020.
- H. Zhao, S. Deng, Z. Liu, J. Yin, and S. Dustdar, “Distributed redundant placement for microservice-based applications at the edge,” arXiv preprint arXiv:1911.03600, 2019.
- L. Ye, Y. Xia, L. Yang, and C. Yan, “SHWS: Stochastic Hybrid Workflows Dynamic Scheduling in Cloud Container Services,” IEEE Transactions on Automation Science and Engineering, 2021.
- Y. Hu, H. Zhou, C. de Laat, and Z. Zhao, “Concurrent container scheduling on heterogeneous clusters with multi-resource constraints,” Future Generation Computer Systems, vol. 102, pp. 562–573, 2020.
- M. Yu, Y. Tian, B. Ji, C. Wu, H. Rajan, and J. Liu, “Gadget: Online resource optimization for scheduling ring-all-reduce learning jobs,” in IEEE INFOCOM 2022-IEEE Conference on Computer Communications. IEEE, 2022, pp. 1569–1578.
- Y. Mao, Y. Fu, W. Zheng, L. Cheng, Q. Liu, and D. Tao, “Speculative container scheduling for deep learning applications in a kubernetes cluster,” IEEE Systems Journal, 2021.
- W. Zheng, M. Tynes, H. Gorelick, Y. Mao, L. Cheng, and Y. Hou, “Flowcon: Elastic flow configuration for containerized deep learning applications,” in Proceedings of the 48th International Conference on Parallel Processing, 2019, pp. 1–10.
- H. Zheng, F. Xu, L. Chen, Z. Zhou, and F. Liu, “Cynthia: Cost-efficient cloud resource provisioning for predictable distributed deep neural network training,” in Proceedings of the 48th International Conference on Parallel Processing, 2019, pp. 1–11.
- D. Narayanan, A. Harlap, A. Phanishayee, V. Seshadri, N. R. Devanur, G. R. Ganger, P. B. Gibbons, and M. Zaharia, “Pipedream: generalized pipeline parallelism for dnn training,” in Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019, pp. 1–15.
- P. Li, E. Koyuncu, and H. Seferoglu, “Respipe: Resilient model-distributed dnn training at edge networks,” in ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 3660–3664.
- Z. Luo, X. Yi, G. Long, S. Fan, C. Wu, J. Yang, and W. Lin, “Efficient pipeline planning for expedited distributed dnn training,” arXiv preprint arXiv:2204.10562, 2022.
- X. Yi, Z. Luo, C. Meng, M. Wang, G. Long, C. Wu, J. Yang, and W. Lin, “Fast training of deep learning models over multiple gpus,” in Proceedings of the 21st International Middleware Conference, 2020, pp. 105–118.
- C. Olston, N. Fiedel, K. Gorovoy, J. Harmsen, L. Lao, F. Li, V. Rajashekhar, S. Ramesh, and J. Soyke, “Tensorflow-serving: Flexible, high-performance ml serving,” arXiv preprint arXiv:1712.06139, 2017.
- “Multi model server: a tool for serving neural net models for inference,” 2022. [Online]. Available: https://github.com/awslabs/multi-modelserver
- T. Liang, J. Glossner, L. Wang, S. Shi, and X. Zhang, “Pruning and quantization for deep neural network acceleration: A survey,” Neurocomputing, vol. 461, pp. 370–403, 2021.
- A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer, “A survey of quantization methods for efficient neural network inference,” arXiv preprint arXiv:2103.13630, 2021.
- Y. Choi and M. Rhu, “Prema: A predictive multi-task scheduling algorithm for preemptible neural processing units,” in 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2020, pp. 220–233.
- D. Mendoza, F. Romero, Q. Li, N. J. Yadwadkar, and C. Kozyrakis, “Interference-aware scheduling for inference serving,” in Proceedings of the 1st Workshop on Machine Learning and Systems, 2021, pp. 80–88.
- N. Zhou, H. Zhou, and D. Hoppe, “Containerisation for high performance computing systems: Survey and prospects,” IEEE Transactions on Software Engineering, 2022.
- A. Reuther, C. Byun, W. Arcand, D. Bestor, B. Bergeron, M. Hubbell, M. Jones, P. Michaleas, A. Prout, A. Rosa et al., “Scalable system scheduling for hpc and big data,” Journal of Parallel and Distributed Computing, vol. 111, pp. 76–92, 2018.
- M. A. Netto, R. N. Calheiros, E. R. Rodrigues, R. L. Cunha, and R. Buyya, “Hpc cloud for scientific and business applications: taxonomy, vision, and research challenges,” ACM Computing Surveys (CSUR), vol. 51, no. 1, pp. 1–29, 2018.
- Y. Fan, “Job scheduling in high performance computing,” arXiv preprint arXiv:2109.09269, 2021.
- Q. Wofford, P. G. Bridges, and P. Widener, “A layered approach for modular container construction and orchestration in hpc environments,” in Proceedings of the 11th Workshop on Scientific Cloud Computing, 2020, pp. 1–8.
- S. Julian, M. Shuey, and S. Cook, “Containers in research: initial experiences with lightweight infrastructure,” in Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale, 2016, pp. 1–6.
- J. Higgins, V. Holmes, and C. Venters, “Orchestrating docker containers in the hpc environment,” in International Conference on High Performance Computing. Springer, 2015, pp. 506–513.
- N. Zhou, Y. Georgiou, L. Zhong, H. Zhou, and M. Pospieszny, “Container orchestration on hpc systems,” in 2020 IEEE 13th International Conference on Cloud Computing (CLOUD). IEEE, 2020, pp. 34–36.
- N. Zhou, “Containerization and orchestration on hpc systems,” in Sustained Simulation Performance 2019 and 2020. Springer, 2021, pp. 133–147.
- N. Zhou, L. Zhong, D. Hoppe, B. Pejak, O. Marko, J. Cardona, M. Czerkawski, I. Andonovic, C. Michie, C. Tachtatzis et al., “Cybele:: A hybrid architecture of hpc and big data for ai applications in agriculture,” in HPC, Big Data, and AI Convergence Towards Exascale. CRC Press, 2022, pp. 255–272.
- F. Liu, K. Keahey, P. Riteau, and J. Weissman, “Dynamically negotiating capacity between on-demand and batch clusters,” in SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2018, pp. 493–503.
- M. E. Piras, L. Pireddu, M. Moro, and G. Zanetti, “Container orchestration on hpc clusters,” in International Conference on High Performance Computing. Springer, 2019, pp. 25–35.
- J. Carnero and F. J. Nieto, “Running simulations in hpc and cloud resources by implementing enhanced tosca workflows,” in 2018 International Conference on High Performance Computing & Simulation (HPCS). IEEE, 2018, pp. 431–438.
- E. Di Nitto, J. Gorroñogoitia, I. Kumara, G. Meditskos, D. Radolović, K. Sivalingam, and R. S. González, “An approach to support automated deployment of applications on heterogeneous cloud-hpc infrastructures,” in 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC). IEEE, 2020, pp. 133–140.
- I. Colonnelli, B. Cantalupo, I. Merelli, and M. Aldinucci, “Streamflow: cross-breeding cloud with hpc,” IEEE Transactions on Emerging Topics in Computing, vol. 9, no. 4, pp. 1723–1737, 2020.
- P. Di Tommaso, M. Chatzou, E. W. Floden, P. P. Barja, E. Palumbo, and C. Notredame, “Nextflow enables reproducible computational workflows,” Nature biotechnology, vol. 35, no. 4, pp. 316–319, 2017.
- “kube-batch,” 2019. [Online]. Available: https://github.com/kubernetes-sigs/kube-batch
- ©Kubeless 2022 project authors, “The kubernetes native serverless framework,” 2022, https://kubeless.io/.
- H. Govind and H. González-Vélez, “Benchmarking serverless workloads on kubernetes,” in 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2021, pp. 704–712.
- K. Djemame, M. Parker, and D. Datsev, “Open-source serverless architectures: an evaluation of apache openwhisk,” in 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC). IEEE, 2020, pp. 329–335.
- S. K. Mohanty, G. Premsankar, M. Di Francesco et al., “An evaluation of open source serverless computing frameworks.” CloudCom, vol. 2018, pp. 115–120, 2018.
- O. Mashayekhi, H. Qu, C. Shah, and P. Levis, “Execution templates: Caching control plane decisions for strong scaling of data analytics,” in 2017 USENIX Annual Technical Conference (USENIX ATC 17), 2017, pp. 513–526.
- J. Huang, C. Xiao, and W. Wu, “Rlsk: a job scheduler for federated kubernetes clusters based on reinforcement learning,” in 2020 IEEE International Conference on Cloud Engineering (IC2E). IEEE, 2020, pp. 116–123.
- Z. Gu, S. Tang, B. Jiang, S. Huang, Q. Guan, and S. Fu, “Characterizing Job-Task Dependency in Cloud Workloads Using Graph Learning,” in 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2021, pp. 288–297.
- “Hadoop: Fair scheduler,” April 2019. [Online]. Available: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
- W. Song, Z. Xiao, Q. Chen, and H. Luo, “Adaptive resource provisioning for the cloud using online bin packing,” IEEE Transactions on Computers, vol. 63, no. 11, pp. 2647–2660, 2013.
- R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella, “Multi-resource packing for cluster schedulers,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 4, pp. 455–466, 2014.
- S. Wang, Z. Ding, and C. Jiang, “Elastic scheduling for microservice applications in clouds,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 1, pp. 98–115, 2020.
- D. Zhao, M. Mohamed, and H. Ludwig, “Locality-aware scheduling for containers in cloud computing,” IEEE Transactions on cloud computing, vol. 8, no. 2, pp. 635–646, 2018.
- L. Bulej, T. Bureš, P. Hnětynka, and D. Khalyeyev, “Self-adaptive K8S Cloud Controller for Time-sensitive Applications,” in 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE, 2021, pp. 166–169.
- G. Ambrosino, G. B. Fioccola, R. Canonico, and G. Ventre, “Container Mapping and its Impact on Performance in Containerized Cloud Environments,” in 2020 IEEE International Conference on Service Oriented Systems Engineering (SOSE). IEEE, 2020, pp. 57–64.
- H. Zhao, S. Deng, F. Chen, J. Yin, S. Dustdar, and A. Y. Zomaya, “Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds,” IEEE Transactions on Parallel and Distributed Systems, 2022.
- F. Rossi, V. Cardellini, and F. L. Presti, “Elastic deployment of software containers in geo-distributed computing environments,” in 2019 IEEE Symposium on Computers and Communications (ISCC). IEEE, 2019, pp. 1–7.
- A. Das, S. Imai, S. Patterson, and M. P. Wittie, “Performance optimization for edge-cloud serverless platforms via dynamic task placement,” in 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). IEEE, 2020, pp. 41–50.
- N. Akhtar, A. Raza, V. Ishakian, and I. Matta, “Cose: Configuring serverless functions using statistical learning,” in IEEE INFOCOM 2020-IEEE Conference on Computer Communications. IEEE, 2020, pp. 129–138.
- Y. Aldwyan, R. O. Sinnott, and G. T. Jayaputera, “Elastic deployment of container clusters across geographically distributed cloud data centers for web applications,” Concurrency and Computation: Practice and Experience, vol. 33, no. 21, p. e6436, 2021.
- T. Shi, H. Ma, G. Chen, and S. Hartmann, “Location-Aware and Budget-Constrained Service Brokering in Multi-Cloud via Deep Reinforcement Learning,” in International Conference on Service-Oriented Computing. Springer, 2021, pp. 756–764.
- ——, “Location-aware and budget-constrained service deployment for composite applications in multi-cloud environment,” IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 8, pp. 1954–1969, 2020.
- H. Sami, A. Mourad, H. Otrok, and J. Bentahar, “Fscaler: Automatic resource scaling of containers in fog clusters using reinforcement learning,” in 2020 international wireless communications and mobile computing (IWCMC). IEEE, 2020, pp. 1824–1829.
- M. Yan, X. Liang, Z. Lu, J. Wu, and W. Zhang, “Hansel: Adaptive horizontal scaling of microservices using bi-lstm,” Applied Soft Computing, vol. 105, p. 107216, 2021.
- A. Marchese and O. Tomarchio, “Network-Aware Container Placement in Cloud-Edge Kubernetes Clusters,” in 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2022, pp. 859–865.
- S. Wang, Y. Guo, N. Zhang, P. Yang, A. Zhou, and X. Shen, “Delay-aware microservice coordination in mobile edge computing: A reinforcement learning approach,” IEEE Transactions on Mobile Computing, vol. 20, no. 3, pp. 939–951, 2019.
- T. Pusztai, S. Nastic, A. Morichetta, V. C. Pujol, P. Raith, S. Dustdar, D. Vij, Y. Xiong, and Z. Zhang, “Polaris scheduler: Slo-and topology-aware microservices scheduling at the edge,” in 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC). IEEE, 2022, pp. 61–70.
- T. Pusztai, F. Rossi, and S. Dustdar, “Pogonip: Scheduling asynchronous applications on the edge,” in 2021 IEEE 14th International Conference on Cloud Computing (CLOUD). IEEE, 2021, pp. 660–670.
- S. Nastic, T. Pusztai, A. Morichetta, V. C. Pujol, S. Dustdar, D. Vij, and Y. Xiong, “Polaris scheduler: Edge sensitive and slo aware workload scheduling in cloud-edge-iot clusters,” in 2021 IEEE 14th International Conference on Cloud Computing (CLOUD). IEEE, 2021, pp. 206–216.
- Z. Tang, X. Zhou, F. Zhang, W. Jia, and W. Zhao, “Migration modeling and learning algorithms for containers in fog computing,” IEEE Transactions on Services Computing, vol. 12, no. 5, pp. 712–725, 2018.
- Z. Miao, P. Yong, Y. Mei, Y. Quanjun, and X. Xu, “A discrete pso-based static load balancing algorithm for distributed simulations in a cloud environment,” Future Generation Computer Systems, vol. 115, pp. 497–516, 2021.
- H. Lu, G. Xu, C. W. Sung, S. Mostafa, and Y. Wu, “A game theoretical balancing approach for offloaded tasks in edge datacenters,” in 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS). IEEE, 2022, pp. 526–536.
- M. Kumar and S. C. Sharma, “Deadline constrained based dynamic load balancing algorithm with elasticity in cloud environment,” Computers & Electrical Engineering, vol. 69, pp. 395–411, 2018.
- R. Yu, V. T. Kilari, G. Xue, and D. Yang, “Load balancing for interdependent iot microservices,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 298–306.
- L. Huang, S. Cheng, Y. Guan, X. Zhang, and Z. Guo, “Consistent user-traffic allocation and load balancing in mobile edge caching,” in IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2020, pp. 592–597.
- J. Wang, G. Zhao, H. Xu, H. Huang, L. Luo, and Y. Yang, “Robust service mapping in multi-tenant clouds,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–10.
- J. O. Gutierrez-Garcia and A. Ramirez-Nafarrate, “Agent-based load balancing in cloud data centers,” Cluster Computing, vol. 18, no. 3, pp. 1041–1062, 2015.
- H. Menon and L. Kalé, “A distributed dynamic load balancer for iterative applications,” in SC’13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE, 2013, pp. 1–11.
- X. Xu, Q. Jiang, P. Zhang, X. Cao, M. R. Khosravi, L. T. Alex, L. Qi, and W. Dou, “Game theory for distributed iov task offloading with fuzzy neural network in edge computing,” IEEE Transactions on Fuzzy Systems, vol. 30, no. 11, pp. 4593–4604, 2022.
- Z. Yao, Z. Ding, and T. Clausen, “Multi-agent reinforcement learning for network load balancing in data center,” in Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 3594–3603.
- A. Asghari and M. K. Sohrabi, “Combined use of coral reefs optimization and multi-agent deep q-network for energy-aware resource provisioning in cloud data centers using dvfs technique,” Cluster Computing, vol. 25, no. 1, pp. 119–140, 2022.
- O. Houidi, S. Bakri, and D. Zeghlache, “Multi-agent graph convolutional reinforcement learning for intelligent load balancing,” in NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium. IEEE, 2022, pp. 1–6.
- A. Shribman and B. Hudzia, “Pre-copy and post-copy vm live migration for memory intensive applications,” in European Conference on Parallel Processing. Springer, 2012, pp. 539–547.
- D. Fernando, J. Terner, K. Gopalan, and P. Yang, “Live migration ate my vm: Recovering a virtual machine after failure of post-copy live migration,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 343–351.
- C. C. Chou, Y. Chen, D. Milojicic, N. Reddy, and P. Gratz, “Optimizing post-copy live migration with system-level checkpoint using fabric-attached memory,” in 2019 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC). IEEE, 2019, pp. 16–24.
- C. Jo, Y. Cho, and B. Egger, “A machine learning approach to live migration modeling,” in Proceedings of the 2017 Symposium on Cloud Computing, 2017, pp. 351–364.
- N. T. Khai, A. Baumgartner, and T. Bauschert, “A multi-step model for migration and resource reallocation in virtualized network infrastructures,” in 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2017, pp. 730–735.
- A. Ruprecht, D. Jones, D. Shiraev, G. Harmon, M. Spivak, M. Krebs, M. Baker-Harvey, and T. Sanderson, “Vm live migration at scale,” ACM SIGPLAN Notices, vol. 53, no. 3, pp. 45–56, 2018.
- C. Li, D. Feng, Y. Hua, W. Xia, L. Qin, Y. Huang, and Y. Zhou, “Bac: Bandwidth-aware compression for efficient live migration of virtual machines,” in IEEE INFOCOM 2017-IEEE Conference on Computer Communications. IEEE, 2017, pp. 1–9.
- F. Le and E. M. Nahum, “Experiences implementing live vm migration over the wan with multi-path tcp,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 1090–1098.
- D. Basu, X. Wang, Y. Hong, H. Chen, and S. Bressan, “Learn-as-you-go with megh: Efficient live migration of virtual machines,” IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 8, pp. 1786–1801, 2019.
- T. Benjaponpitak, M. Karakate, and K. Sripanidkulchai, “Enabling live migration of containerized applications across clouds,” in IEEE INFOCOM 2020-IEEE Conference on Computer Communications. IEEE, 2020, pp. 2529–2538.
- P. K. Sinha, S. S. Doddamani, H. Lu, and K. Gopalan, “mwarp: accelerating intra-host live container migration via memory warping,” in IEEE INFOCOM 2019-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2019, pp. 508–513.
- B. Xu, S. Wu, J. Xiao, H. Jin, Y. Zhang, G. Shi, T. Lin, J. Rao, L. Yi, and J. Jiang, “Sledge: Towards efficient live migration of docker containers,” in 2020 IEEE 13th International Conference on Cloud Computing (CLOUD). IEEE, 2020, pp. 321–328.
- R. Torre, E. Urbano, H. Salah, G. T. Nguyen, and F. H. Fitzek, “Towards a better understanding of live migration performance with docker containers,” in European Wireless 2019; 25th European Wireless Conference. VDE, 2019, pp. 1–6.
- F. Romero, Q. Li, N. J. Yadwadkar, and C. Kozyrakis, “INFaaS: Automated model-less inference serving,” in 2021 USENIX Annual Technical Conference (USENIX ATC 21), 2021, pp. 397–411.
- C. Zhang, M. Yu, W. Wang, and F. Yan, “MArk: Exploiting cloud services for Cost-Effective, SLO-Aware machine learning inference serving,” in 2019 USENIX Annual Technical Conference (USENIX ATC 19), 2019, pp. 1049–1062.
- S. Shillaker and P. Pietzuch, “Faasm: Lightweight isolation for efficient stateful serverless computing,” in 2020 USENIX Annual Technical Conference (USENIX ATC 20), 2020, pp. 419–433.
- N. Akhtar, I. Matta, A. Raza, and Y. Wang, “El-sec: Elastic management of security applications on virtualized infrastructure,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2018, pp. 778–783.
- G. R. Russo, V. Cardellini, G. Casale, and F. L. Presti, “Mead: Model-based vertical auto-scaling for data stream processing,” in 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2021, pp. 314–323.
- E. B. Lakew, A. V. Papadopoulos, M. Maggio, C. Klein, and E. Elmroth, “Kpi-agnostic control for fine-grained vertical elasticity,” in 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). IEEE, 2017, pp. 589–598.
- Y. Sfakianakis, M. Marazakis, C. Kozanitis, and A. Bilas, “Latest: Vertical elasticity for millisecond serverless execution,” in 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2022, pp. 879–885.
- S. K. Tesfatsion, L. Tomás, and J. Tordsson, “Optibook: Optimal resource booking for energy-efficient datacenters,” in 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS). IEEE, 2017, pp. 1–10.
- X. Fei, F. Liu, H. Xu, and H. Jin, “Adaptive vnf scaling and flow routing with proactive demand prediction,” in IEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 486–494.
- M. Avgeris, D. Dechouniotis, N. Athanasopoulos, and S. Papavassiliou, “Adaptive resource allocation for computation offloading: A control-theoretic approach,” ACM Transactions on Internet Technology (TOIT), vol. 19, no. 2, pp. 1–20, 2019.
- R. Mahmud, K. Ramamohanarao, and R. Buyya, “Latency-aware application module management for fog computing environments,” ACM Transactions on Internet Technology (TOIT), vol. 19, no. 1, pp. 1–21, 2018.
- L. Schuler, S. Jamil, and N. Kühl, “Ai-based resource allocation: Reinforcement learning for adaptive auto-scaling in serverless environments,” in 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 2021, pp. 804–811.
- “F5 nginx management suite.” [Online]. Available: https://www.nginx.com/
- “HAProxy: the reliable, high performance TCP/HTTP load balancer.” [Online]. Available: https://www.haproxy.org/
- D. E. Eisenbud, C. Yi, C. Contavalli, C. Smith, R. Kononov, E. Mann-Hielscher, A. Cilingiroglu, B. Cheyney, W. Shang, and J. D. Hosein, “Maglev: A fast and reliable software network load balancer,” in 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), 2016, pp. 523–535.