Datacenter RPCs can be General and Fast (1806.00680v2)

Published 2 Jun 2018 in cs.OS

Abstract: It is commonly believed that datacenter networking software must sacrifice generality to attain high performance. The popularity of specialized distributed systems designed specifically for niche technologies such as RDMA, lossless networks, FPGAs, and programmable switches testifies to this belief. In this paper, we show that such specialization is not necessary. eRPC is a new general-purpose remote procedure call (RPC) library that offers performance comparable to specialized systems, while running on commodity CPUs in traditional datacenter networks based on either lossy Ethernet or lossless fabrics. eRPC performs well in three key metrics: message rate for small messages; bandwidth for large messages; and scalability to a large number of nodes and CPU cores. It handles packet loss, congestion, and background request execution. In microbenchmarks, one CPU core can handle up to 10 million small RPCs per second, or send large messages at 75 Gbps. We port a production-grade implementation of Raft state machine replication to eRPC without modifying the core Raft source code. We achieve 5.5 microseconds of replication latency on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA.

Citations (284)

Summary

  • The paper introduces eRPC, a general-purpose RPC library that handles up to 10 million small RPCs per second on a single CPU core over commodity datacenter networks.
  • Benchmarks show 5.5 µs Raft replication latency on lossy Ethernet and large-message transfers at 75 Gbps from a single CPU core.
  • The findings challenge the need for specialized hardware by showing that an optimized host-based library can match the performance of specialized systems.

An Expert Analysis of "Datacenter RPCs can be General and Fast"

The paper “Datacenter RPCs can be General and Fast” challenges the prevailing assumption that datacenter networking software must trade off generality for performance. The authors introduce eRPC, a general-purpose Remote Procedure Call (RPC) library designed to achieve state-of-the-art performance on commodity datacenter networks without relying on specialized network technologies such as RDMA, lossless networks, or programmable switches. In doing so, they weigh in on the debate over whether datacenter applications need additional in-network functionality or can rely on end-to-end solutions.

eRPC Design and Performance

eRPC is engineered to deliver high message rates for small messages, high bandwidth for large messages, and scalability to a large number of nodes and CPU cores. It handles packet loss, congestion, and long-running background requests. In microbenchmarks, a single CPU core handles up to 10 million small RPCs per second or sends large messages at 75 Gbps. eRPC targets traditional datacenter networks, running on lossy Ethernet without Priority Flow Control (PFC) as well as on lossless fabrics.
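To put the single-core figures in perspective, the back-of-the-envelope sketch below converts them into an average per-RPC time budget and a byte rate. It uses only the two numbers quoted above; the function names are illustrative, and it assumes decimal units (1 Gbps = 10^9 bits per second).

```python
def per_rpc_budget_ns(rpcs_per_second: float) -> float:
    """Average time one core can spend per RPC, in nanoseconds."""
    return 1e9 / rpcs_per_second


def gbps_to_gigabytes_per_second(gbps: float) -> float:
    """Convert a bit rate in Gbps to gigabytes per second (decimal units)."""
    return gbps / 8.0


# Figures quoted in the paper's abstract.
small_rpc_budget = per_rpc_budget_ns(10_000_000)      # 10 million RPCs/s
large_msg_rate = gbps_to_gigabytes_per_second(75.0)   # 75 Gbps

print(f"Per-RPC budget: {small_rpc_budget:.0f} ns")
print(f"Large-message rate: {large_msg_rate:.3f} GB/s")
```

At 10 million RPCs per second, a core has on average about 100 ns per RPC, so any per-request overhead on the critical path matters.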

The authors also show that eRPC integrates with existing software: they port a production-grade implementation of Raft state machine replication to eRPC without altering the core Raft source code. The port achieves a replication latency of 5.5 µs on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA.

Technical Contributions and Results

The paper presents several significant contributions, notably:

  1. Performance Optimization: eRPC incorporates common-case optimizations, zero-copy transmission, and a scalable implementation whose NIC memory footprint is independent of cluster size. These techniques can improve eRPC's performance for target workloads by up to 66%.
  2. Lossy Network Performance: The paper provides evidence that state-of-the-art networking performance can be attained in a 100-node cluster with lossy Ethernet, without the need for lossless fabrics. Microbenchmarks indicate that eRPC achieves 2.3 µs median RPC latency while sustaining high RPC rates and large message transfers.
  3. Integration as a Drop-in Library: The ability to use eRPC with unmodified existing software underscores its potential as a drop-in high-performance networking solution, as demonstrated with the integration into Raft for replicated in-memory key-value storage.
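The first contribution mentions a NIC memory footprint that is independent of cluster size. The toy model below only illustrates what that property means, by contrasting state that grows with the number of peers against state tied to a fixed number of queues. It is not the paper's design: the function names are hypothetical, and the byte counts and queue count are made-up placeholders rather than measurements.

```python
def per_peer_state_bytes(num_peers: int, bytes_per_peer: int) -> int:
    """State that grows linearly with the number of remote peers."""
    return num_peers * bytes_per_peer


def fixed_state_bytes(num_queues: int, bytes_per_queue: int) -> int:
    """State tied to a fixed number of local queues, regardless of peers."""
    return num_queues * bytes_per_queue


# Hypothetical placeholder values, chosen only to make the trend visible.
BYTES_PER_PEER = 256
BYTES_PER_QUEUE = 256
NUM_QUEUES = 8

for cluster_size in (10, 100, 1000):
    growing = per_peer_state_bytes(cluster_size, BYTES_PER_PEER)
    constant = fixed_state_bytes(NUM_QUEUES, BYTES_PER_QUEUE)
    print(f"{cluster_size:>5} peers: per-peer={growing:>7} B, fixed={constant} B")
```

In this model the per-peer footprint grows a hundredfold between 10 and 1000 peers, while the fixed footprint does not change; the latter is the scaling behavior the contribution describes.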

Implications and Future Speculations

The results contradict the conventional expectation that specialized hardware and co-design are prerequisites for high performance in datacenter networks. By optimizing for common use cases and leveraging modern NIC capabilities, eRPC challenges the prevailing view that lossless networking or hardware support is essential for high performance. This has several implications:

  • Practical Implications: eRPC reduces reliance on niche hardware, which can lower infrastructure costs and simplify system designs. It offers a viable alternative to specialized solutions by delivering competitive performance without sacrificing generality.
  • Theoretical Implications: The paper adds important insights into RPC library design, emphasizing efficient host-based optimizations as a feasible pathway instead of embedding functionalities into the network.
  • Research Speculations: The emergence of eRPC suggests a promising direction for further research into general-purpose networking solutions that marry high performance with operational simplicity, especially as datacenter architectures continue to evolve. Future developments may focus on refining congestion control and further integrating such libraries with cloud-native environments.

In conclusion, eRPC makes a compelling case for reconsidering whether highly specialized network features are necessary for high performance, offering a pragmatic approach that meets the needs of modern datacenter applications while preserving generality. The paper highlights the potential of software-based solutions to advance datacenter networking performance.