
Tail-Learning: Adaptive Learning Method for Mitigating Tail Latency in Autonomous Edge Systems (2312.16883v1)

Published 28 Dec 2023 in cs.DC

Abstract: In edge computing, the growing demand for high Quality of Service (QoS), particularly in dynamic multimedia streaming applications (e.g., Augmented Reality/Virtual Reality and online gaming), calls for effective solutions. However, adopting an edge paradigm grounded in distributed computing has exacerbated the issue of tail latency. Because edge servers support only a limited variety of multimedia services and user requests are dynamic, modeling tail latency with traditional queuing methods in distributed edge computing is challenging, and head-of-line (HoL) blocking is substantially exacerbated. In response, we develop a learning-based scheduling method to mitigate the overall tail latency, which adaptively selects appropriate edge servers for execution as incoming distributed tasks of unknown size vary. To make effective use of the edge computing paradigm, we use Laplace transform techniques to theoretically derive an upper bound on the response time of edge servers. We then integrate this upper bound into reinforcement learning to facilitate tail learning and enable informed decisions for autonomous distributed scheduling. Experimental results demonstrate the method's efficiency in reducing tail latency compared to existing methods.
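
As a rough illustration of how a Laplace-transform argument can bound a response-time tail, the sketch below applies a Chernoff-style bound to a single M/M/1 server. This is not the paper's derivation: the M/M/1 model, the function names, and the rates `lam` and `mu` are all assumptions chosen only because the exact tail is known in closed form and can be compared with the bound.

```python
import math


def mm1_tail_exact(lam: float, mu: float, t: float) -> float:
    """Exact P(T > t) for an M/M/1 queue (assumed model, not from the paper).

    The response time T is exponential with rate theta = mu - lam.
    """
    theta = mu - lam
    if theta <= 0:
        raise ValueError("queue is unstable: need lam < mu")
    return math.exp(-theta * t)


def mm1_tail_chernoff(lam: float, mu: float, t: float) -> float:
    """Chernoff-style upper bound on P(T > t) from the transform of T.

    E[exp(s*T)] = theta / (theta - s) for 0 <= s < theta, so
    P(T > t) <= min_s exp(-s*t) * theta / (theta - s).
    The minimiser is s = theta - 1/t whenever theta * t > 1.
    """
    theta = mu - lam
    if theta <= 0:
        raise ValueError("queue is unstable: need lam < mu")
    if theta * t <= 1.0:
        return 1.0  # best choice is s = 0, which gives the trivial bound
    return theta * t * math.exp(1.0 - theta * t)


if __name__ == "__main__":
    lam, mu = 8.0, 10.0  # hypothetical arrival and service rates
    for t in (0.5, 1.0, 2.0, 4.0):
        print(t, mm1_tail_exact(lam, mu, t), mm1_tail_chernoff(lam, mu, t))
```

The bound is looser than the exact tail but decays at the same exponential rate. A bound of this kind could in principle be fed to a scheduler as a penalty or reward term, which is the general role the abstract assigns to its own upper bound.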

