Grand Perspective: Load Shedding in Distributed CEP Applications (2309.17183v1)
Abstract: In distributed Complex Event Processing (CEP) applications with high load but limited resources, bottleneck operators in the operator graph can significantly slow down the processing of event streams, making it necessary to shed load. A high-quality load shedding strategy resolves the bottleneck while preserving output quality: it evaluates each event's importance with regard to the application's final output and drops less important events from the event stream in favor of important ones. So far, no solution has been proposed that permits good load shedding in distributed, multi-operator CEP applications. On the one hand, shedding strategies proposed for single-operator CEP applications measure an event's importance only at the bottleneck operator itself and thereby ignore the effect of other streams in the application on that event's importance. On the other hand, shedding strategies proposed for multi-operator applications in the area of stream processing assume operators with a fixed selectivity, which does not hold for the conditional operators of CEP. We therefore propose a load-shedding solution for distributed CEP applications that maximizes the application's final output and ensures timely processing of important events by using a set of CEP-tailored selectivity functions and a linear program that is an abstraction of the CEP application. Moreover, our solution ensures a quality-optimal shedder configuration even under dynamically changing conditions. Through extensive evaluations on both synthetic and real data, we show that our solution successfully resolves overload at bottleneck operators while maximizing the quality of the application's output.