FASTFLOW: Flexible Adaptive Congestion Control for High-Performance Datacenters (2404.01630v3)

Published 2 Apr 2024 in cs.NI

Abstract: The increasing demands of ML workloads in datacenters place significant stress on current congestion control (CC) algorithms, many of which struggle to maintain performance at scale. These workloads generate bursty, synchronized traffic that requires both rapid response and fairness across flows. Unfortunately, existing CC algorithms that rely heavily on delay as a primary congestion signal often fail to react quickly enough and do not consistently ensure fairness. In this paper, we propose FASTFLOW, a streamlined sender-based CC algorithm that integrates delay, ECN signals, and optional packet trimming to achieve precise, real-time adjustments to congestion windows. Central to FASTFLOW is the QuickAdapt mechanism, which provides accurate bandwidth estimation at the receiver, enabling faster reactions to network conditions. We also show that FASTFLOW can effectively enhance receiver-based algorithms such as EQDS by improving their ability to manage in-network congestion. Our evaluation reveals that FASTFLOW outperforms cutting-edge solutions, including EQDS, Swift, BBR, and MPRDMA, delivering up to 50% performance improvements in modern datacenter networks.
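
The abstract describes a sender-based congestion-window update driven by three signals: delay, ECN marks, and optional packet trimming. The sketch below illustrates what a combined-signal update of that general shape could look like in Python. It is a minimal sketch only: every name, constant, threshold, and gain (e.g., update_cwnd, BASE_RTT_US, TARGET_DELAY_US, the backoff factors) is an illustrative assumption, and it does not reproduce the paper's actual FASTFLOW or QuickAdapt logic.

```python
# Minimal sketch of a sender-side congestion-window update that combines a
# delay signal, ECN marks, and a packet-trimming indication, in the spirit of
# the mechanism the abstract describes. All constants and gains below are
# illustrative assumptions, not values from the paper.

BASE_RTT_US = 10.0        # assumed propagation delay (microseconds)
TARGET_DELAY_US = 25.0    # assumed queuing-delay target (microseconds)
MIN_CWND = 1.0            # packets
MAX_CWND = 256.0          # packets


def update_cwnd(cwnd: float, rtt_us: float, ecn_fraction: float,
                trimmed: bool, additive_gain: float = 1.0,
                multiplicative_backoff: float = 0.8) -> float:
    """Return the new congestion window after one RTT of feedback.

    cwnd          -- current window in packets
    rtt_us        -- most recent RTT sample in microseconds
    ecn_fraction  -- fraction of ACKed packets carrying ECN marks
    trimmed       -- whether any packet in the window was trimmed (header-only)
    """
    queuing_delay = max(rtt_us - BASE_RTT_US, 0.0)

    if trimmed:
        # Trimming indicates severe congestion: back off sharply.
        cwnd *= multiplicative_backoff ** 2
    elif ecn_fraction > 0.0 or queuing_delay > TARGET_DELAY_US:
        # Scale the backoff with the strength of the congestion signal.
        severity = max(ecn_fraction,
                       min(queuing_delay / (2 * TARGET_DELAY_US), 1.0))
        cwnd *= 1.0 - (1.0 - multiplicative_backoff) * severity
    else:
        # No congestion signal: probe for bandwidth additively.
        cwnd += additive_gain

    return min(max(cwnd, MIN_CWND), MAX_CWND)


if __name__ == "__main__":
    cwnd = 16.0
    # One uncongested RTT, then one RTT with 30% ECN-marked ACKs.
    cwnd = update_cwnd(cwnd, rtt_us=12.0, ecn_fraction=0.0, trimmed=False)
    cwnd = update_cwnd(cwnd, rtt_us=40.0, ecn_fraction=0.3, trimmed=False)
    print(f"cwnd after two RTTs: {cwnd:.2f} packets")
```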

References (66)
  1. Implementing packet trimming support in hardware. (2022). arXiv:cs.NI/2207.04967
  2. CONGA: Distributed Congestion-Aware Load Balancing for Datacenters. In Proceedings of the 2014 ACM Conference on SIGCOMM (SIGCOMM ’14). Association for Computing Machinery, New York, NY, USA, 503–514. https://doi.org/10.1145/2619239.2626316
  3. Data Center TCP (DCTCP). SIGCOMM Comput. Commun. Rev. 40, 4 (aug 2010), 63–74. https://doi.org/10.1145/1851275.1851192
  4. Data Center TCP (DCTCP). In Proceedings of the ACM SIGCOMM 2010 Conference (SIGCOMM ’10). Association for Computing Machinery, New York, NY, USA, 63–74. https://doi.org/10.1145/1851182.1851192
  5. Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). USENIX Association, San Jose, CA, 253–266. https://www.usenix.org/conference/nsdi12/technical-sessions/presentation/alizadeh
  6. Bolt: Sub-RTT Congestion Control for Ultra-Low Latency. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 219–236. https://www.usenix.org/conference/nsdi23/presentation/arslan
  7. Empowering Azure Storage with RDMA. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 49–67. https://www.usenix.org/conference/nsdi23/presentation/bai
  8. Maciej Besta and Torsten Hoefler. 2014. Slim Fly: A Cost Effective Low-Diameter Network Topology. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’14). IEEE Press, 348–359. https://doi.org/10.1109/SC.2014.34
  9. Broadcom. 2024a. Deploying AI/ML training clusters with IP/Ethernet. (2024). https://www.broadcom.com/blog/deploying-ai-ml-training-clusters-with-ip-ethernet (accessed 01/24).
  10. Broadcom. 2024b. Tomahawk 5 Switch. (2024). https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm78900-series (accessed 01/24).
  11. Per-Packet Load-Balanced, Low-Latency Routing for Clos-Based Data Center Networks. In Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT ’13). Association for Computing Machinery, New York, NY, USA, 49–60. https://doi.org/10.1145/2535372.2535375
  12. BBR: Congestion-Based Congestion Control. Commun. ACM 60 (2017), 58–66. http://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext
  13. V. Cerf and R. Kahn. 1974. A Protocol for Packet Network Intercommunication. IEEE Transactions on Communications 22, 5 (1974), 637–648. https://doi.org/10.1109/TCOM.1974.1092259
  14. Understanding TCP Incast Throughput Collapse in Datacenter Networks. In Proceedings of the 1st ACM Workshop on Research on Enterprise Networking (WREN ’09). Association for Computing Machinery, New York, NY, USA, 73–82. https://doi.org/10.1145/1592681.1592693
  15. Catch the Whole Lot in an Action: Rapid Precise Packet Loss Notification in Data Center. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14). USENIX Association, Seattle, WA, 17–28. https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/cheng
  16. D. Amodei and D. Hernandez. 2018. AI and Compute. (2018). https://openai.com/research/ai-and-compute (accessed 9/23).
  17. Noise in the Clouds: Influence of Network Performance Variability on Application Scalability. Proc. ACM Meas. Anal. Comput. Syst. 6, 3, Article 49 (Dec. 2022), 27 pages. https://doi.org/10.1145/3570609 arXiv:2210.15315
  18. Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’19). Association for Computing Machinery, New York, NY, USA, Article 16, 32 pages. https://doi.org/10.1145/3295500.3356196
  19. An In-Depth Analysis of the Slingshot Interconnect. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. 1–14. https://doi.org/10.1109/SC41405.2020.00039
  20. Jeffrey Dean and Luiz André Barroso. 2013. The Tail at Scale. Commun. ACM 56, 2 (feb 2013), 74–80. https://doi.org/10.1145/2408776.2408794
  21. On the impact of packet spraying in data center networks. In 2013 Proceedings IEEE INFOCOM. 2130–2138. https://doi.org/10.1109/INFCOM.2013.6567015
  22. S. Floyd and V. Jacobson. 1993. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking 1, 4 (1993), 397–413. https://doi.org/10.1109/90.251892
  23. The Addition of Explicit Congestion Notification (ECN) to IP. RFC 3168. (Sept. 2001). https://doi.org/10.17487/RFC3168
  24. PHost: Distributed near-Optimal Datacenter Transport over Commodity Network Fabric. In Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT ’15). Association for Computing Machinery, New York, NY, USA, Article 1, 12 pages. https://doi.org/10.1145/2716281.2836086
  25. DRILL: Micro Load Balancing for Low-Latency Data Center Networks. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’17). Association for Computing Machinery, New York, NY, USA, 225–238. https://doi.org/10.1145/3098822.3098839
  26. Aquila: A unified, low-latency fabric for datacenter networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 1249–1266. https://www.usenix.org/conference/nsdi22/presentation/gibson
  27. Backpressure Flow Control. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 779–805. https://www.usenix.org/conference/nsdi22/presentation/goyal
  28. BCube: A High Performance, Server-Centric Network Architecture for Modular Data Centers. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication (SIGCOMM ’09). Association for Computing Machinery, New York, NY, USA, 63–74. https://doi.org/10.1145/1592568.1592577
  29. Re-Architecting Datacenter Networks and Stacks for Low Latency and High Performance. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’17). Association for Computing Machinery, New York, NY, USA, 29–42. https://doi.org/10.1145/3098822.3098825
  30. HammingMesh: A Network Topology for Large-Scale Deep Learning. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC ’22). IEEE Press, Article 11, 18 pages.
  31. Data Center Ethernet and Remote Direct Memory Access: Issues at Hyperscale. Computer 56, 7 (2023), 67–77. https://doi.org/10.1109/MC.2023.3261184
  32. The Effect of Network Noise on Large-Scale Collective Communications. Parallel Processing Letters (PPL) 19, 4 (Aug. 2009), 573–593.
  33. C. Hopps. 2000. Analysis of an Equal-Cost Multi-Path Algorithm. RFC 2992. (Nov. 2000). https://www.ietf.org/rfc/rfc2992.txt
  34. Network Endpoint Congestion Control for Fine-Grained Communication. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’15). Association for Computing Machinery, New York, NY, USA, Article 35, 12 pages. https://doi.org/10.1145/2807591.2807600
  35. FlowBender: Flow-level Adaptive Routing for Improved Latency and Throughput in Datacenter Networks. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies (CoNEXT ’14). Association for Computing Machinery, New York, NY, USA, 149–160. https://doi.org/10.1145/2674005.2674985
  36. Technology-Driven, Highly-Scalable Dragonfly Topology. In 2008 International Symposium on Computer Architecture. 77–88. https://doi.org/10.1109/ISCA.2008.19
  37. Swift: Delay is Simple and Effective for Congestion Control in the Datacenter. In Proceedings of the ACM SIGCOMM 2020 Conference (SIGCOMM ’20). Association for Computing Machinery, New York, NY, USA. https://dl.acm.org/doi/pdf/10.1145/3387514.3406591
  38. DX: Latency-Based Congestion Control for Datacenters. IEEE/ACM Transactions on Networking 25, 1 (2017), 335–348. https://doi.org/10.1109/TNET.2016.2587286
  39. HPCC: High Precision Congestion Control. In Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM ’19). Association for Computing Machinery, New York, NY, USA, 44–58. https://doi.org/10.1145/3341302.3342085
  40. Multi-path transport for RDMA in datacenters. In Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation (NSDI’18). USENIX Association, USA, 357–371.
  41. TIMELY: RTT-based Congestion Control for the Datacenter. In SIGCOMM ’15.
  42. Revisiting Network Support for RDMA. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’18). Association for Computing Machinery, New York, NY, USA, 313–326. https://doi.org/10.1145/3230543.3230557
  43. Homa: A Receiver-Driven Low-Latency Transport Protocol Using Network Priorities. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’18). Association for Computing Machinery, New York, NY, USA, 221–235. https://doi.org/10.1145/3230543.3230564
  44. Kathleen Nichols and Van Jacobson. 2012. Controlling Queue Delay: A modern AQM is just one piece of the solution to bufferbloat. Queue 10, 5 (may 2012), 20–34. https://doi.org/10.1145/2208917.2209336
  45. Nvidia. 2024. Networking for the Era of AI: The Network Defines the Data Center. (2024). https://nvdam.widen.net/s/bvpmlkbgzt/networking-overall-whitepaper-networking-for-ai-2911204 (accessed 01/24).
  46. An edge-queued datagram service for all datacenter traffic. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 761–777. https://www.usenix.org/conference/nsdi22/presentation/olteanu
  47. Jupiter Evolving: Transforming Google’s Datacenter Network via Optical Circuit Switches and Software-Defined Networking. In Proceedings of ACM SIGCOMM 2022.
  48. PLB: congestion signals are simple and effective for network load balancing. In Proceedings of the ACM SIGCOMM 2022 Conference (SIGCOMM ’22). Association for Computing Machinery, New York, NY, USA, 207–218. https://doi.org/10.1145/3544216.3544226
  49. Congestion control in machine learning clusters. In Proceedings of the 21st ACM Workshop on Hot Topics in Networks (HotNets ’22). Association for Computing Machinery, New York, NY, USA, 235–242. https://doi.org/10.1145/3563766.3564115
  50. Adaptive Routing in InfiniBand Hardware. In 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). 463–472. https://doi.org/10.1109/CCGrid54584.2022.00056
  51. Inside the Social Network’s (Datacenter) Network. SIGCOMM Comput. Commun. Rev. 45, 4 (aug 2015), 123–137. https://doi.org/10.1145/2829988.2787472
  52. HINT: Supporting Congestion Control Decisions with P4-driven In-Band Network Telemetry. In 2023 IEEE 24th International Conference on High Performance Switching and Routing (HPSR). 83–88. https://doi.org/10.1109/HPSR57248.2023.10147977
  53. Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19).
  54. A Cloud-Optimized Transport Protocol for Elastic and Scalable HPC. IEEE Micro 40, 6 (2020), 67–73. https://doi.org/10.1109/MM.2020.3016891
  55. Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network. In SIGCOMM ’15.
  56. Surviving switch failures in cloud datacenters. SIGCOMM Comput. Commun. Rev. 51, 2 (may 2021), 2–9. https://doi.org/10.1145/3464994.3464996
  57. RoCC: Robust Congestion Control for RDMA. In Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT ’20). Association for Computing Machinery, New York, NY, USA, 17–30. https://doi.org/10.1145/3386367.3431316
  58. The Computational Limits of Deep Learning. (2022). arXiv:cs.LG/2007.05558
  59. Deadline-Aware Datacenter Tcp (D2TCP). In Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM ’12). Association for Computing Machinery, New York, NY, USA, 115–126. https://doi.org/10.1145/2342356.2342388
  60. Congestion Control Using In-Network Telemetry for Lossless Datacenters. Computers, Materials & Continua 75, 1 (2023), 1195–1212. https://doi.org/10.32604/cmc.2023.035932
  61. Tuning ECN for Data Center Networks. In ACM CoNEXT’12. ACM. https://www.microsoft.com/en-us/research/publication/tuning-ecn-for-data-center-networks/
  62. EMPTCP: An ECN Based Approach to Detect Shared Bottleneck in MPTCP. In 2019 28th International Conference on Computer Communication and Networks (ICCCN). 1–10. https://doi.org/10.1109/ICCCN.2019.8847013
  63. High-Resolution Measurement of Data Center Microbursts. In Proceedings of the 2017 Internet Measurement Conference (IMC ’17). Association for Computing Machinery, New York, NY, USA, 78–85. https://doi.org/10.1145/3131365.3131375
  64. PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 2228–2237. https://doi.org/10.1109/INFOCOM48880.2022.9796803
  65. ExpressPass++: Credit-Efficient Congestion Control for Data Centers. In 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). 46–52. https://doi.org/10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00018
  66. Congestion Control for Large-Scale RDMA Deployments. In Proceedings of the ACM SIGCOMM 2015 Conference (SIGCOMM ’15). Association for Computing Machinery. https://www.microsoft.com/en-us/research/publication/congestion-control-for-large-scale-rdma-deployments/
