The Cost of Garbage Collection for State Machine Replication (2405.11182v1)
Abstract: State Machine Replication (SMR) protocols form the backbone of many distributed systems. Enterprises and startups increasingly build their distributed systems on the cloud due to its many advantages, such as scalability and cost-effectiveness. One of the first technical questions companies face when building a system on the cloud is which programming language to use. Among the many factors that go into this decision is whether to use a language with garbage collection (GC), such as Java or Go, or a language with manual memory management, such as C++ or Rust. Today, companies predominantly prefer languages with GC, like Go, Kotlin, or even Python, due to ease of development; however, there is no free lunch: GC costs resources (memory and CPU) and performance (long tail latencies due to GC pauses). While there have been anecdotal reports of reduced cloud cost and improved tail latencies when switching from a language with GC to a language with manual memory management, there has so far been no systematic study of the GC overhead of running an SMR-based cloud system. This paper studies the overhead of running an SMR-based cloud system written in a language with GC. To this end, we design from scratch a canonical SMR system -- a MultiPaxos-based replicated in-memory key-value store -- and we implement it in C++, Java, Rust, and Go. We compare the performance and resource usage of these implementations when running on the cloud under different workloads and resource constraints and report our results. Our findings have implications for the design of cloud systems.