Towards Fair and Efficient Learning-based Congestion Control (2403.01798v1)
Abstract: Recent years have witnessed a plethora of learning-based solutions for congestion control (CC) that demonstrate better performance than traditional TCP schemes. However, they fail to provide consistently good convergence properties, including fairness, fast convergence, and stability, because their objective functions do not match these properties. Although the idea is intuitive, integrating these properties into existing learning-based CC is challenging, because: 1) their training environments are designed to optimize the performance of a single flow and are incapable of cooperative multi-flow optimization, and 2) there is no directly measurable metric that expresses these properties in the training objective function. We present Astraea, a new learning-based congestion control scheme that ensures fast convergence to fairness with stability. At the heart of Astraea is a multi-agent deep reinforcement learning framework that explicitly optimizes these convergence properties during training by enabling multiple competing flows to learn an interactive policy, while maintaining high performance. We further build a faithful multi-flow environment that emulates the competing behaviors of concurrent flows and explicitly expresses the convergence properties so that they can be optimized during training. We have fully implemented Astraea, and our comprehensive experiments show that Astraea converges quickly to the fairness point and exhibits better stability than its counterparts. For example, Astraea achieves near-optimal bandwidth sharing (i.e., fairness) when multiple flows compete for the same bottleneck and delivers up to 8.4× faster convergence and 2.8× smaller throughput deviation, while achieving comparable or even better performance than prior solutions.
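To make the notions of "fairness" and "throughput deviation" concrete, the sketch below computes two commonly used quantities from per-flow throughput samples. It is illustrative only: the abstract does not say which metrics Astraea uses, so Jain's fairness index and the population standard deviation are assumptions here, and the function names `jain_fairness_index` and `throughput_deviation` are hypothetical.

```python
from statistics import pstdev


def jain_fairness_index(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when all flows get equal shares and 1/n when a single
    flow takes the whole bottleneck. Assumed metric, not taken from the
    abstract.
    """
    n = len(throughputs)
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one flow with non-zero throughput")
    return (total * total) / (n * sum_sq)


def throughput_deviation(samples):
    """Population standard deviation of one flow's throughput over time.

    Used here as a simple stand-in for stability: smaller values mean
    the flow's rate fluctuates less.
    """
    return pstdev(samples)


if __name__ == "__main__":
    # Three flows sharing one bottleneck (Mbps, made-up numbers).
    print(jain_fairness_index([33.0, 34.0, 33.0]))
    # One flow's throughput sampled over time (Mbps, made-up numbers).
    print(throughput_deviation([32.0, 34.0, 33.0, 33.0]))
```

Under these assumptions, a scheme that converges to the fairness point would push the index towards 1.0 for flows sharing a bottleneck, while a more stable scheme would show a smaller deviation for each flow once converged.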