Scaling Data Center TCP to Terabits with Laminar (2504.19058v1)
Abstract: Laminar is the first TCP stack designed for the reconfigurable match-action table (RMT) architecture, widely used in high-speed programmable switches and SmartNICs. Laminar reimagines TCP processing as a pipeline of simple match-action operations, enabling line-rate performance with low latency and minimal energy consumption, while maintaining compatibility with standard TCP and POSIX sockets. Leveraging novel techniques like optimistic concurrency, pseudo segment updates, and bump-in-the-wire processing, Laminar handles the transport logic, including retransmission, reassembly, flow control, and congestion control, entirely within the RMT pipeline. We prototype Laminar on an Intel Tofino2 switch and demonstrate its scalability to terabit speeds, its flexibility, and its robustness to network dynamics. Laminar reaches an unprecedented 25M packets/sec with a single host core for streaming workloads, enough to exceed 1.6 Tbps with an 8K MTU. Laminar delivers RDMA-equivalent performance, saving up to 16 host CPU cores versus the TAS kernel-bypass TCP stack with short RPC workloads, while achieving 1.3$\times$ higher peak throughput at 5$\times$ lower 99.99th-percentile tail latency. A key-value store on Laminar doubles the throughput-per-watt versus TAS. Demonstrating Laminar's flexibility, we implement TCP stack extensions, including a sequencer API for a linearizable distributed shared log, a new congestion control protocol, and delayed ACKs.
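The step from 25M packets/sec to more than 1.6 Tbps can be checked with a short back-of-the-envelope calculation. The sketch below assumes that "8K MTU" means 8192 bytes per packet and counts every byte of the packet as throughput (no subtraction for headers or framing); the abstract states neither assumption, and the function and constant names are illustrative only.

```python
def throughput_tbps(pkts_per_sec: float, bytes_per_pkt: int) -> float:
    """Raw bit rate in Tbps (10^12 bits/s) for a given packet rate and packet size."""
    return pkts_per_sec * bytes_per_pkt * 8 / 1e12


PKT_RATE = 25e6        # packets/sec on a single host core, from the abstract
MTU_BYTES = 8 * 1024   # assumption: "8K MTU" read as 8192 bytes

rate = throughput_tbps(PKT_RATE, MTU_BYTES)
print(f"{rate:.3f} Tbps")  # prints "1.638 Tbps"
```

Under the alternative reading of 8K as 8000 bytes, the same formula gives exactly 1.6 Tbps, so the "exceed" in the abstract is consistent with the 8192-byte interpretation.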