NotNets: Accelerating Microservices by Bypassing the Network (2404.06581v1)
Published 9 Apr 2024 in cs.DC
Abstract: Remote procedure calls are the workhorse of distributed systems. However, as software engineering trends, such as micro-services and serverless computing, push applications towards ever finer-grained decompositions, the overhead of RPC-based communication is becoming too great to bear. In this paper, we argue that point solutions that attempt to optimize one aspect of RPC logic are unlikely to mitigate these ballooning communication costs. Rather, we need a dramatic reappraisal of how we provide communication. Towards this end, we propose to emulate message-passing RPCs by sharing message payloads and metadata on CXL 3.0-backed far memory. We provide initial evidence of feasibility and analyze the expected benefits.
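To make the final idea in the abstract concrete, here is a minimal sketch of emulating a message-passing call by sharing the payload and a small metadata header in a common memory region, rather than serializing and copying the message over a socket. It is an illustration under stated assumptions, not the authors' design: Python's `multiprocessing.shared_memory` stands in for CXL 3.0-backed far memory, the header layout (`HEADER`) and the helper names `send`, `receive`, and `demo` are hypothetical, and the synchronization and memory-ordering machinery a real system would need is omitted.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical metadata layout: (payload offset, payload length), little-endian.
HEADER = struct.Struct("<II")


def send(region: shared_memory.SharedMemory, payload: bytes) -> None:
    """Caller side: place the payload in the shared region, then publish its descriptor."""
    offset = HEADER.size
    region.buf[offset:offset + len(payload)] = payload
    HEADER.pack_into(region.buf, 0, offset, len(payload))


def receive(region_name: str) -> bytes:
    """Callee side: attach to the same region and locate the payload via the descriptor."""
    region = shared_memory.SharedMemory(name=region_name)
    try:
        offset, length = HEADER.unpack_from(region.buf, 0)
        # Copied out only so the demo can return a value after detaching;
        # a real callee could operate on the shared bytes in place.
        return bytes(region.buf[offset:offset + length])
    finally:
        region.close()


def demo() -> bytes:
    """Run caller and callee against one region (assumption: same host, no concurrency)."""
    region = shared_memory.SharedMemory(create=True, size=4096)
    try:
        send(region, b"get_user(id=42)")
        return receive(region.name)
    finally:
        region.close()
        region.unlink()
```

Calling `demo()` writes a request into the region and reads it back through the descriptor. Only the region name and the fixed-size header need to be exchanged between the two sides; the payload itself is never sent over a network path.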