ARCANE: Adaptive Routing with Caching and Aware Network Exploration (2407.21625v4)
Abstract: Next-generation datacenters require highly efficient network load balancing to manage the growing scale of AI training and general datacenter traffic. Existing solutions designed for Ethernet, such as Equal Cost Multi-Path (ECMP) and oblivious packet spraying (OPS), struggle to maintain high network utilization as datacenter topologies, and with them the number of network failures, continue to grow. To address these limitations, we propose ARCANE, a lightweight, decentralized, per-packet adaptive load balancing algorithm designed to optimize network utilization while ensuring rapid recovery from link failures. ARCANE adapts to network conditions by caching well-performing paths. When a network failure occurs, ARCANE reroutes traffic away from the failure in less than 100 microseconds. ARCANE is designed to be deployed with next-generation out-of-order transports, such as Ultra Ethernet, and introduces less than 25 bytes of per-connection state. We extensively evaluate ARCANE in large-scale simulations and on FPGA-based NICs.