ShuffleBench: A Benchmark for Large-Scale Data Shuffling Operations with Distributed Stream Processing Frameworks (2403.04570v1)
Abstract: Distributed stream processing frameworks help build scalable and reliable applications that perform transformations and aggregations on continuous data streams. This paper introduces ShuffleBench, a novel benchmark for evaluating the performance of modern stream processing frameworks. In contrast to other benchmarks, it focuses on use cases where stream processing frameworks are employed mainly to shuffle (i.e., re-distribute) data records so that state-local aggregations can be performed, while the actual aggregation logic is treated as a black-box software component. ShuffleBench is inspired by requirements for near-real-time analytics at a large cloud observability platform and adopts metrics and methods for latency, throughput, and scalability established in the performance engineering research community. Although inspired by a real-world observability use case, it is highly configurable to allow domain-independent evaluations. ShuffleBench is provided as ready-to-use open-source software that builds on existing Kubernetes tooling and includes implementations for four state-of-the-art frameworks. We therefore expect ShuffleBench to be a valuable contribution both to industrial practitioners building stream processing applications and to researchers working on new stream processing approaches. We complement this paper with an experimental performance evaluation that employs ShuffleBench with various configurations on Flink, Hazelcast, Kafka Streams, and Spark in a cloud-native environment. Our results show that Flink achieves the highest throughput, while Hazelcast processes data streams with the lowest latency.
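The core pattern the benchmark targets can be illustrated with a minimal sketch (not ShuffleBench's actual implementation): records are shuffled by hash-partitioning on their key, so that all records sharing a key land on the same partition, where a black-box aggregation function is applied to the key's state. The function and record names here are hypothetical, chosen only to mirror the keyBy/aggregate pattern the abstract describes.

```python
# Hypothetical sketch of hash-based shuffling with black-box per-key
# aggregation, the workload pattern ShuffleBench is built to measure.
from collections import defaultdict
from typing import Callable, Iterable, Tuple


def shuffle(records: Iterable[Tuple[str, int]], num_partitions: int):
    """Re-distribute records to partitions by key hash, as a stream
    processing framework's keyBy/groupBy step would."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def aggregate_partition(partition, agg: Callable):
    """Apply the (black-box) aggregation to each key's local state.
    All state for a key is guaranteed to be in this one partition."""
    state = defaultdict(list)
    for key, value in partition:
        state[key].append(value)
    return {key: agg(values) for key, values in state.items()}


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
partitions = shuffle(records, num_partitions=2)

results = {}
for partition in partitions.values():
    results.update(aggregate_partition(partition, sum))
# results == {"a": 4, "b": 2, "c": 4}
```

Because the aggregation logic is opaque to the framework, the measurable cost in this setup is dominated by the shuffle itself (serialization, network transfer, and state partitioning), which is exactly what distinguishes this benchmark from workloads built around framework-native operators.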
- Replication Package for: ShuffleBench: A Benchmark for Large-Scale Data Shuffling Operations with Distributed Stream Processing Frameworks. https://doi.org/10.5281/zenodo.10605615