MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks (2302.01301v1)
Abstract: Fast and efficient transport protocols are the foundation of an increasingly distributed world. The burden of continuously delivering improved communication performance to support next-generation applications and services, combined with the increasing heterogeneity of systems and network technologies, has promoted the design of Congestion Control (CC) algorithms that perform well in specific environments. Designing a generic CC algorithm that can adapt to a broad range of scenarios remains an open research question. To tackle this challenge, we propose to apply a novel Reinforcement Learning (RL) approach. Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return and models the learning process as an infinite-horizon task. We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch that researchers have encountered when applying RL to CC. We evaluated our solution on the task of file transfer and compared it to TCP Cubic. While further research is required, our results show that MARLIN can achieve performance comparable to TCP Cubic with little hyperparameter tuning, in a task significantly different from its training setting. Therefore, we believe that our work represents a promising first step toward building CC algorithms based on the maximum entropy RL framework.
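For readers unfamiliar with the maximum entropy framework mentioned in the abstract, the objective that Soft Actor-Critic is generally described as optimizing can be sketched as follows. This is the standard discounted, infinite-horizon form from the SAC literature, not a formula taken from this abstract. The symbols (reward r, discount factor \gamma, temperature \alpha, policy \pi) are assumptions introduced here for illustration, and the abstract does not state the specific reward or hyperparameter values MARLIN uses.

```latex
% Maximum entropy RL objective (standard form; notation assumed, not from the abstract)
J(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}
         \Big( r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big) = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s_t)}\big[ \log \pi(a \mid s_t) \big]
```

In this form, the temperature \alpha trades off return against policy entropy: setting \alpha = 0 recovers the conventional expected discounted return, while larger values reward more stochastic policies.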
Authors: Raffaele Galliera, Alessandro Morelli, Roberto Fronteddu, Niranjan Suri