Tailoring Machine Learning for Process Mining (2306.10341v1)

Published 17 Jun 2023 in cs.LG, cs.AI, and cs.DB

Abstract: Machine learning models are routinely integrated into process mining pipelines to carry out tasks such as data transformation, noise reduction, anomaly detection, classification, and prediction. The design of such models often rests on ad hoc assumptions about the underlying data distributions, which do not necessarily match the non-parametric distributions typically observed in process data. Moreover, the learning procedures they follow ignore the constraints that concurrency imposes on process data. Data encoding is a key element in smoothing the mismatch between these assumptions and the data, but its potential is poorly exploited. In this paper, we argue that deeper insight into the issues raised by training machine learning models with process data is crucial to ground a sound integration of process mining and machine learning. Our analysis of these issues aims to lay the foundation for a methodology that correctly aligns machine learning with process mining requirements and to stimulate further research in this direction.

Authors (4)
  1. Paolo Ceravolo
  2. Sylvio Barbon Junior
  3. Ernesto Damiani
  4. Wil van der Aalst
