FASTFLOW: Flexible Adaptive Congestion Control for High-Performance Datacenters (2404.01630v3)

Published 2 Apr 2024 in cs.NI

Abstract: The increasing demands of ML workloads in datacenters place significant stress on current congestion control (CC) algorithms, many of which struggle to maintain performance at scale. These workloads generate bursty, synchronized traffic that requires both rapid response and fairness across flows. Unfortunately, existing CC algorithms that rely heavily on delay as a primary congestion signal often fail to react quickly enough and do not consistently ensure fairness. In this paper, we propose FASTFLOW, a streamlined sender-based CC algorithm that integrates delay, ECN signals, and optional packet trimming to achieve precise, real-time adjustments to congestion windows. Central to FASTFLOW is the QuickAdapt mechanism, which provides accurate bandwidth estimation at the receiver, enabling faster reactions to network conditions. We also show that FASTFLOW can effectively enhance receiver-based algorithms such as EQDS by improving their ability to manage in-network congestion. Our evaluation reveals that FASTFLOW outperforms cutting-edge solutions, including EQDS, Swift, BBR, and MPRDMA, delivering up to 50% performance improvements in modern datacenter networks.
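
The abstract describes a sender-based congestion-window update driven by three signals: delay, ECN marks, and optional packet trimming. The sketch below illustrates what a combined-signal update of that general shape could look like in Python. It is a minimal sketch only: every name, constant, threshold, and gain (e.g., update_cwnd, BASE_RTT_US, TARGET_DELAY_US, the backoff factors) is an illustrative assumption, and it does not reproduce the paper's actual FASTFLOW or QuickAdapt logic.

```python
# Minimal sketch of a sender-side congestion-window update that combines a
# delay signal, ECN marks, and a packet-trimming indication, in the spirit of
# the mechanism the abstract describes. All constants and gains below are
# illustrative assumptions, not values from the paper.

BASE_RTT_US = 10.0        # assumed propagation delay (microseconds)
TARGET_DELAY_US = 25.0    # assumed queuing-delay target (microseconds)
MIN_CWND = 1.0            # packets
MAX_CWND = 256.0          # packets


def update_cwnd(cwnd: float, rtt_us: float, ecn_fraction: float,
                trimmed: bool, additive_gain: float = 1.0,
                multiplicative_backoff: float = 0.8) -> float:
    """Return the new congestion window after one RTT of feedback.

    cwnd          -- current window in packets
    rtt_us        -- most recent RTT sample in microseconds
    ecn_fraction  -- fraction of ACKed packets carrying ECN marks
    trimmed       -- whether any packet in the window was trimmed (header-only)
    """
    queuing_delay = max(rtt_us - BASE_RTT_US, 0.0)

    if trimmed:
        # Trimming indicates severe congestion: back off sharply.
        cwnd *= multiplicative_backoff ** 2
    elif ecn_fraction > 0.0 or queuing_delay > TARGET_DELAY_US:
        # Scale the backoff with the strength of the congestion signal.
        severity = max(ecn_fraction,
                       min(queuing_delay / (2 * TARGET_DELAY_US), 1.0))
        cwnd *= 1.0 - (1.0 - multiplicative_backoff) * severity
    else:
        # No congestion signal: probe for bandwidth additively.
        cwnd += additive_gain

    return min(max(cwnd, MIN_CWND), MAX_CWND)


if __name__ == "__main__":
    cwnd = 16.0
    # One uncongested RTT, then one RTT with 30% ECN-marked ACKs.
    cwnd = update_cwnd(cwnd, rtt_us=12.0, ecn_fraction=0.0, trimmed=False)
    cwnd = update_cwnd(cwnd, rtt_us=40.0, ecn_fraction=0.3, trimmed=False)
    print(f"cwnd after two RTTs: {cwnd:.2f} packets")
```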

References (66)
  1. Implementing packet trimming support in hardware. (2022). arXiv:cs.NI/2207.04967
  2. CONGA: Distributed Congestion-Aware Load Balancing for Datacenters. In Proceedings of the 2014 ACM Conference on SIGCOMM (SIGCOMM ’14). Association for Computing Machinery, New York, NY, USA, 503–514. https://doi.org/10.1145/2619239.2626316
  3. Data Center TCP (DCTCP). SIGCOMM Comput. Commun. Rev. 40, 4 (aug 2010), 63–74. https://doi.org/10.1145/1851275.1851192
  4. Data Center TCP (DCTCP). In Proceedings of the ACM SIGCOMM 2010 Conference (SIGCOMM ’10). Association for Computing Machinery, New York, NY, USA, 63–74. https://doi.org/10.1145/1851182.1851192
  5. Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). USENIX Association, San Jose, CA, 253–266. https://www.usenix.org/conference/nsdi12/technical-sessions/presentation/alizadeh
  6. Bolt: Sub-RTT Congestion Control for Ultra-Low Latency. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 219–236. https://www.usenix.org/conference/nsdi23/presentation/arslan
  7. Empowering Azure Storage with RDMA. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). USENIX Association, Boston, MA, 49–67. https://www.usenix.org/conference/nsdi23/presentation/bai
  8. Maciej Besta and Torsten Hoefler. 2014. Slim Fly: A Cost Effective Low-Diameter Network Topology. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’14). IEEE Press, 348–359. https://doi.org/10.1109/SC.2014.34
  9. Broadcom. 2024a. Deploying AI/ML training clusters with IP/Ethernet. (2024). https://www.broadcom.com/blog/deploying-ai-ml-training-clusters-with-ip-ethernet (accessed 01/24).
  10. Broadcom. 2024b. Tomahawk 5 Switch. (2024). https://www.broadcom.com/products/ethernet-connectivity/switching/strataxgs/bcm78900-series (accessed 01/24).
  11. Per-Packet Load-Balanced, Low-Latency Routing for Clos-Based Data Center Networks. In Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT ’13). Association for Computing Machinery, New York, NY, USA, 49–60. https://doi.org/10.1145/2535372.2535375
  12. BBR: Congestion-Based Congestion Control. Commun. ACM 60 (2017), 58–66. http://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext
  13. V. Cerf and R. Kahn. 1974. A Protocol for Packet Network Intercommunication. IEEE Transactions on Communications 22, 5 (1974), 637–648. https://doi.org/10.1109/TCOM.1974.1092259
  14. Understanding TCP Incast Throughput Collapse in Datacenter Networks. In Proceedings of the 1st ACM Workshop on Research on Enterprise Networking (WREN ’09). Association for Computing Machinery, New York, NY, USA, 73–82. https://doi.org/10.1145/1592681.1592693
  15. Catch the Whole Lot in an Action: Rapid Precise Packet Loss Notification in Data Center. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14). USENIX Association, Seattle, WA, 17–28. https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/cheng
  16. D. Amodei and D. Hernandez. 2018. AI and Compute. (2018). https://openai.com/research/ai-and-compute (accessed 9/23).
  17. Noise in the Clouds: Influence of Network Performance Variability on Application Scalability. Proc. ACM Meas. Anal. Comput. Syst. 6, 3, Article 49 (Dec. 2022), 27 pages. https://doi.org/10.1145/3570609 arXiv:2210.15315
  18. Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’19). Association for Computing Machinery, New York, NY, USA, Article 16, 32 pages. https://doi.org/10.1145/3295500.3356196
  19. An In-Depth Analysis of the Slingshot Interconnect. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. 1–14. https://doi.org/10.1109/SC41405.2020.00039
  20. Jeffrey Dean and Luiz André Barroso. 2013. The Tail at Scale. Commun. ACM 56, 2 (feb 2013), 74–80. https://doi.org/10.1145/2408776.2408794
  21. On the impact of packet spraying in data center networks. In 2013 Proceedings IEEE INFOCOM. 2130–2138. https://doi.org/10.1109/INFCOM.2013.6567015
  22. S. Floyd and V. Jacobson. 1993. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking 1, 4 (1993), 397–413. https://doi.org/10.1109/90.251892
  23. The Addition of Explicit Congestion Notification (ECN) to IP. RFC 3168. (Sept. 2001). https://doi.org/10.17487/RFC3168
  24. PHost: Distributed near-Optimal Datacenter Transport over Commodity Network Fabric. In Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT ’15). Association for Computing Machinery, New York, NY, USA, Article 1, 12 pages. https://doi.org/10.1145/2716281.2836086
  25. DRILL: Micro Load Balancing for Low-Latency Data Center Networks. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’17). Association for Computing Machinery, New York, NY, USA, 225–238. https://doi.org/10.1145/3098822.3098839
  26. Aquila: A unified, low-latency fabric for datacenter networks. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 1249–1266. https://www.usenix.org/conference/nsdi22/presentation/gibson
  27. Backpressure Flow Control. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 779–805. https://www.usenix.org/conference/nsdi22/presentation/goyal
  28. BCube: A High Performance, Server-Centric Network Architecture for Modular Data Centers. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication (SIGCOMM ’09). Association for Computing Machinery, New York, NY, USA, 63–74. https://doi.org/10.1145/1592568.1592577
  29. Re-Architecting Datacenter Networks and Stacks for Low Latency and High Performance. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’17). Association for Computing Machinery, New York, NY, USA, 29–42. https://doi.org/10.1145/3098822.3098825
  30. HammingMesh: A Network Topology for Large-Scale Deep Learning. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC ’22). IEEE Press, Article 11, 18 pages.
  31. Data Center Ethernet and Remote Direct Memory Access: Issues at Hyperscale. Computer 56, 7 (2023), 67–77. https://doi.org/10.1109/MC.2023.3261184
  32. The Effect of Network Noise on Large-Scale Collective Communications. Parallel Processing Letters (PPL) 19, 4 (Aug. 2009), 573–593.
  33. C. Hopps. 2000. Analysis of an Equal-Cost Multi-Path Algorithm. RFC 2992. (Nov. 2000). https://www.ietf.org/rfc/rfc2992.txt
  34. Network Endpoint Congestion Control for Fine-Grained Communication. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’15). Association for Computing Machinery, New York, NY, USA, Article 35, 12 pages. https://doi.org/10.1145/2807591.2807600
  35. FlowBender: Flow-level Adaptive Routing for Improved Latency and Throughput in Datacenter Networks. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies (CoNEXT ’14). Association for Computing Machinery, New York, NY, USA, 149–160. https://doi.org/10.1145/2674005.2674985
  36. Technology-Driven, Highly-Scalable Dragonfly Topology. In 2008 International Symposium on Computer Architecture. 77–88. https://doi.org/10.1109/ISCA.2008.19
  37. Swift: Delay is Simple and Effective for Congestion Control in the Datacenter. In Proceedings of the ACM SIGCOMM 2020 Conference (SIGCOMM ’20). Association for Computing Machinery, New York, NY, USA. https://dl.acm.org/doi/pdf/10.1145/3387514.3406591
  38. DX: Latency-Based Congestion Control for Datacenters. IEEE/ACM Transactions on Networking 25, 1 (2017), 335–348. https://doi.org/10.1109/TNET.2016.2587286
  39. HPCC: High Precision Congestion Control. In Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM ’19). Association for Computing Machinery, New York, NY, USA, 44–58. https://doi.org/10.1145/3341302.3342085
  40. Multi-path transport for RDMA in datacenters. In Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation (NSDI’18). USENIX Association, USA, 357–371.
  41. TIMELY: RTT-based Congestion Control for the Datacenter. In SIGCOMM ’15.
  42. Revisiting Network Support for RDMA. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’18). Association for Computing Machinery, New York, NY, USA, 313–326. https://doi.org/10.1145/3230543.3230557
  43. Homa: A Receiver-Driven Low-Latency Transport Protocol Using Network Priorities. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM ’18). Association for Computing Machinery, New York, NY, USA, 221–235. https://doi.org/10.1145/3230543.3230564
  44. Kathleen Nichols and Van Jacobson. 2012. Controlling Queue Delay: A modern AQM is just one piece of the solution to bufferbloat. Queue 10, 5 (may 2012), 20–34. https://doi.org/10.1145/2208917.2209336
  45. Nvidia. 2024. Networking for the Era of AI: The Network Defines the Data Center. (2024). https://nvdam.widen.net/s/bvpmlkbgzt/networking-overall-whitepaper-networking-for-ai-2911204 (accessed 01/24).
  46. An edge-queued datagram service for all datacenter traffic. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 761–777. https://www.usenix.org/conference/nsdi22/presentation/olteanu
  47. Jupiter Evolving: Transforming Google’s Datacenter Network via Optical Circuit Switches and Software-Defined Networking. In Proceedings of ACM SIGCOMM 2022.
  48. PLB: congestion signals are simple and effective for network load balancing. In Proceedings of the ACM SIGCOMM 2022 Conference (SIGCOMM ’22). Association for Computing Machinery, New York, NY, USA, 207–218. https://doi.org/10.1145/3544216.3544226
  49. Congestion control in machine learning clusters. In Proceedings of the 21st ACM Workshop on Hot Topics in Networks (HotNets ’22). Association for Computing Machinery, New York, NY, USA, 235–242. https://doi.org/10.1145/3563766.3564115
  50. Adaptive Routing in InfiniBand Hardware. In 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). 463–472. https://doi.org/10.1109/CCGrid54584.2022.00056
  51. Inside the Social Network’s (Datacenter) Network. SIGCOMM Comput. Commun. Rev. 45, 4 (aug 2015), 123–137. https://doi.org/10.1145/2829988.2787472
  52. HINT: Supporting Congestion Control Decisions with P4-driven In-Band Network Telemetry. In 2023 IEEE 24th International Conference on High Performance Switching and Routing (HPSR). 83–88. https://doi.org/10.1109/HPSR57248.2023.10147977
  53. Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19).
  54. A Cloud-Optimized Transport Protocol for Elastic and Scalable HPC. IEEE Micro 40, 6 (2020), 67–73. https://doi.org/10.1109/MM.2020.3016891
  55. Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network. In SIGCOMM ’15.
  56. Surviving switch failures in cloud datacenters. SIGCOMM Comput. Commun. Rev. 51, 2 (may 2021), 2–9. https://doi.org/10.1145/3464994.3464996
  57. RoCC: Robust Congestion Control for RDMA. In Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT ’20). Association for Computing Machinery, New York, NY, USA, 17–30. https://doi.org/10.1145/3386367.3431316
  58. The Computational Limits of Deep Learning. (2022). arXiv:cs.LG/2007.05558
  59. Deadline-Aware Datacenter Tcp (D2TCP). In Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM ’12). Association for Computing Machinery, New York, NY, USA, 115–126. https://doi.org/10.1145/2342356.2342388
  60. Congestion Control Using In-Network Telemetry for Lossless Datacenters. Computers, Materials & Continua 75, 1 (2023), 1195–1212. https://doi.org/10.32604/cmc.2023.035932
  61. Tuning ECN for Data Center Networks. In ACM CoNEXT’12. ACM. https://www.microsoft.com/en-us/research/publication/tuning-ecn-for-data-center-networks/
  62. EMPTCP: An ECN Based Approach to Detect Shared Bottleneck in MPTCP. In 2019 28th International Conference on Computer Communication and Networks (ICCCN). 1–10. https://doi.org/10.1109/ICCCN.2019.8847013
  63. High-Resolution Measurement of Data Center Microbursts. In Proceedings of the 2017 Internet Measurement Conference (IMC ’17). Association for Computing Machinery, New York, NY, USA, 78–85. https://doi.org/10.1145/3131365.3131375
  64. PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 2228–2237. https://doi.org/10.1109/INFOCOM48880.2022.9796803
  65. ExpressPass++: Credit-Efficient Congestion Control for Data Centers. In 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). 46–52. https://doi.org/10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00018
  66. Congestion Control for Large-Scale RDMA Deployments. In Proceedings of the ACM SIGCOMM 2015 Conference (SIGCOMM ’15). Association for Computing Machinery. https://www.microsoft.com/en-us/research/publication/congestion-control-for-large-scale-rdma-deployments/
