Laconic: Streamlined Load Balancers for SmartNICs (2403.11411v1)
Abstract: Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on improving their efficiency by implementing Layer-4 load-balancing logic within the kernel or using hardware acceleration. This work explores whether the more complex and connection-oriented Layer-7 load-balancing capability can also benefit from hardware acceleration. In particular, we target the offloading of load-balancing capability onto programmable SmartNICs. We fully leverage the cost and energy efficiency of SmartNICs using three key ideas. First, we argue that a full and complex TCP/IP stack is not required for Layer-7 load balancers and instead propose a lightweight forwarding agent on the SmartNIC. Second, we develop connection management data structures that provide a high degree of concurrency with minimal synchronization when executed on multi-core SmartNICs. Finally, we describe how the load-balancing logic could be accelerated using custom packet-processing accelerators on SmartNICs. We prototype Laconic on two types of SmartNIC hardware, achieving over 150 Gbps throughput using all cores on BlueField-2, while a single SmartNIC core achieves 8.7x higher throughput and comparable latency to Nginx on a single x86 core.
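The abstract does not spell out how the connection management data structures avoid synchronization. As a minimal sketch of one common approach (an assumption here, not necessarily Laconic's design), each flow can be hashed to a single core, so that every core owns a private connection table and never needs a lock. The names `ShardedConnTable`, `Connection`, `core_for`, and the round-robin backend choice are all hypothetical and chosen only for illustration.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical flow key: (client_ip, client_port, virtual_ip, virtual_port).
FlowKey = Tuple[str, int, str, int]


@dataclass
class Connection:
    """Per-connection state kept by the core that owns the flow."""
    backend: str
    bytes_forwarded: int = 0


class ShardedConnTable:
    """One private table per core; a flow always maps to the same core.

    Because no two cores ever touch the same shard, no locking is needed.
    This is an illustrative assumption, not the paper's stated design.
    """

    def __init__(self, num_cores: int, backends: List[str]) -> None:
        self.shards: List[Dict[FlowKey, Connection]] = [
            {} for _ in range(num_cores)
        ]
        self.backends = list(backends)
        # Per-core round-robin cursor, so backend choice is also core-local.
        self.cursor = [0] * num_cores

    def core_for(self, key: FlowKey) -> int:
        """Deterministically map a flow to its owning core."""
        data = "|".join(str(part) for part in key).encode()
        return zlib.crc32(data) % len(self.shards)

    def lookup_or_create(self, key: FlowKey) -> Tuple[int, Connection]:
        """Return (owning core, connection), creating state on first packet."""
        core = self.core_for(key)
        shard = self.shards[core]
        conn = shard.get(key)
        if conn is None:
            backend = self.backends[self.cursor[core] % len(self.backends)]
            self.cursor[core] += 1
            conn = Connection(backend=backend)
            shard[key] = conn
        return core, conn

    def close(self, key: FlowKey) -> None:
        """Drop state for a finished connection."""
        self.shards[self.core_for(key)].pop(key, None)


table = ShardedConnTable(num_cores=4, backends=["10.0.0.1", "10.0.0.2"])
flow = ("192.0.2.7", 40000, "203.0.113.1", 80)
core, conn = table.lookup_or_create(flow)
conn.bytes_forwarded += 1500  # only the owning core updates this state
```

In a real deployment the flow-to-core mapping would typically be done by the NIC's receive-side steering rather than in software; the sketch only shows why per-core ownership removes the need for shared locks.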