Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks (2407.02310v1)
Abstract: The process mining community has recently recognized the potential of LLMs for tackling various process mining tasks. Initial studies report that LLMs can support process analysis and, to some extent, even reason about how processes work. This latter property suggests that LLMs could also be used to tackle process mining tasks that benefit from an understanding of process behavior. Examples of such tasks include (semantic) anomaly detection and next activity prediction, both of which involve reasoning about the meaning of activities and their inter-relations. In this paper, we investigate the capabilities of LLMs to tackle such semantics-aware process mining tasks. Furthermore, whereas most works on the intersection of LLMs and process mining focus only on testing these models out of the box, we provide a more principled investigation of the utility of LLMs for process mining, including their ability to obtain process mining knowledge post hoc by means of in-context learning and supervised fine-tuning. Concretely, we define three process mining tasks that benefit from an understanding of process semantics and provide extensive benchmarking datasets for each of them. Our evaluation experiments reveal that (1) LLMs fail to solve challenging process mining tasks out of the box, as well as when provided with only a handful of in-context examples, but that (2) they yield strong performance when fine-tuned for these tasks, consistently surpassing smaller, encoder-based language models.
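To make the in-context learning condition concrete, the sketch below shows what a few-shot prompt for next activity prediction can look like. This is a minimal illustration, not the paper's actual setup: the instruction wording, the example traces, and the `query_llm` stub are all assumptions introduced here for readability.

```python
# Illustrative few-shot (in-context learning) prompt for next activity
# prediction. The prompt wording, example traces, and the query_llm stub
# are assumptions for illustration; they are not taken from the paper.

FEW_SHOT_EXAMPLES = [
    ("receive order, check stock, pick items", "pack items"),
    ("receive application, check documents", "assess eligibility"),
]

def build_prompt(trace_prefix: str) -> str:
    """Assemble an instruction plus in-context examples for one query."""
    lines = [
        "Given the activities executed so far in a business process, "
        "predict the most likely next activity.",
        "",
    ]
    for prefix, next_act in FEW_SHOT_EXAMPLES:
        lines.append(f"Executed so far: {prefix}")
        lines.append(f"Next activity: {next_act}")
        lines.append("")
    lines.append(f"Executed so far: {trace_prefix}")
    lines.append("Next activity:")
    return "\n".join(lines)

def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to any instruction-tuned LLM."""
    raise NotImplementedError("plug in your model or API client here")

print(build_prompt("receive order, check stock"))
```

The key point the abstract makes is that this kind of few-shot prompting, like zero-shot use, is insufficient for the semantics-aware tasks studied; the handful of examples does not convey enough process knowledge to the model.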
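For the supervised fine-tuning condition that the abstract reports as effective, a common parameter-efficient route is LoRA. The sketch below shows how such an adapter setup can be wired up with Hugging Face `transformers` and `peft`; the base model and all hyperparameters are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of parameter-efficient supervised fine-tuning with LoRA,
# using Hugging Face transformers + peft. The model name and all
# hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)  # used to encode (prompt, target) pairs
model = AutoModelForCausalLM.from_pretrained(base)

# Train small low-rank adapters on the attention projections instead of
# updating all model weights.
lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of parameters

# Fine-tuning then proceeds with a standard causal-LM training loop over
# (prompt, target) pairs derived from the task's benchmarking dataset.
```

The design rationale for an adapter-based setup is that it lets a large decoder model absorb task-specific process knowledge from the benchmarking datasets at a fraction of the cost of full fine-tuning, which aligns with the abstract's finding that fine-tuned LLMs outperform smaller, encoder-based language models.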
Authors: Adrian Rebmann, Fabian David Schmidt, Goran Glavaš, Han van der Aa