Network-Aware Reliability Modeling and Optimization for Microservice Placement (2405.18001v1)
Abstract: Optimizing microservice placement to enhance the reliability of services is crucial for improving the service level of microservice architecture-based mobile networks and Internet of Things (IoT) networks. Despite extensive research on service reliability, the impact of network load and routing on service reliability remains understudied, leading to suboptimal models and unsatisfactory performance. To address this issue, we propose a novel network-aware service reliability model that effectively captures the correlation between network state changes and reliability. Based on this model, we formulate the microservice placement problem as an integer nonlinear programming problem, aiming to maximize service reliability. Subsequently, a service reliability-aware placement (SRP) algorithm is proposed to solve the problem efficiently. To reduce bandwidth consumption, we further discuss the microservice placement problem with the shared backup path mechanism and propose a placement algorithm based on the SRP algorithm using shared path reliability calculation, known as the SRP-S algorithm. Extensive simulations demonstrate that the SRP algorithm reduces service failures by up to 29% compared to the benchmark algorithms. By introducing the shared backup path mechanism, the SRP-S algorithm reduces bandwidth consumption by up to 62% compared to the SRP algorithm with the fully protected path mechanism. It also reduces service failures by up to 21% compared to the SRP algorithm with the shared backup mechanism.
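As a rough illustration of why a backup path raises service reliability, the sketch below uses the textbook series/parallel calculation. It is not the network-aware model proposed in the paper: it assumes independent link failures, fully disjoint primary and backup paths, and fixed per-link reliabilities that do not change with load. The function names and the numbers are hypothetical.

```python
from math import prod


def path_reliability(link_reliabilities):
    """Series system: a path works only if every one of its links works.

    Assumes independent link failures (an illustrative simplification).
    """
    return prod(link_reliabilities)


def protected_reliability(primary_links, backup_links):
    """Parallel system: the service survives if either path works.

    Assumes the primary and backup paths share no links.
    """
    r_primary = path_reliability(primary_links)
    r_backup = path_reliability(backup_links)
    return 1.0 - (1.0 - r_primary) * (1.0 - r_backup)


# Hypothetical per-link reliabilities, chosen only for illustration.
primary = [0.99, 0.98, 0.99]
backup = [0.97, 0.97]

unprotected = path_reliability(primary)
protected = protected_reliability(primary, backup)
print(f"primary only: {unprotected:.4f}")
print(f"with backup:  {protected:.4f}")
```

Under these assumptions the protected value can never fall below the primary-only value. The cost is the bandwidth reserved on the backup path, which is the overhead the shared backup path mechanism in the abstract aims to reduce.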