- The paper introduces a domain-aware process discovery framework that translates textual process knowledge into formal constraints using LLMs.
- It employs prompt engineering to extract declarative rules from natural language, refining process models under the Inductive Miner Recursive framework.
- A case study at UWV demonstrates that integrating domain insights improves model fidelity and conformance checking compared to traditional methods.
Bridging Domain Knowledge and Process Discovery Using LLMs
This paper presents a framework for integrating domain knowledge into process discovery using large language models (LLMs). The authors observe that existing automated process discovery methods often neglect domain expertise and documentation, which are typically expressed in natural language. The proposed methodology uses LLMs to bridge the gap between textual process knowledge and the discovery of process models, so that discovered models align with both domain insights and process execution data.
Summary of the Paper
The core innovation of the paper is the use of LLMs to translate textual domain knowledge into declarative constraints that guide the process discovery phase. The approach builds on the Inductive Miner Recursive (IMr) framework, which incrementally constructs process models from event logs while pruning suboptimal structures based on the rules derived from domain knowledge. The authors illustrate the feasibility of this approach through a case study with the UWV employee insurance agency.
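The pruning idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the IMr algorithm itself: candidate structures are stood in for by small finite sets of traces they would allow, the two constraint templates ("response" and "precedence") are assumed examples of declarative constraints, and the activity names and the `Constraint`, `satisfies`, and `prune` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

Trace = Sequence[str]


@dataclass(frozen=True)
class Constraint:
    template: str  # "response" or "precedence" (assumed template names)
    a: str
    b: str


def satisfies(trace: Trace, c: Constraint) -> bool:
    """Check one trace against one declarative constraint."""
    if c.template == "response":
        # Every occurrence of a must eventually be followed by b.
        pending = False
        for act in trace:
            if act == c.a:
                pending = True
            elif act == c.b:
                pending = False
        return not pending
    if c.template == "precedence":
        # b may only occur after a has occurred at least once.
        seen_a = False
        for act in trace:
            if act == c.a:
                seen_a = True
            elif act == c.b and not seen_a:
                return False
        return True
    raise ValueError(f"unsupported template: {c.template}")


def prune(candidates: Dict[str, List[Trace]], rules: List[Constraint]) -> List[str]:
    """Keep only candidates whose allowed traces violate no rule."""
    return [
        name
        for name, traces in candidates.items()
        if all(satisfies(t, r) for t in traces for r in rules)
    ]


# Hypothetical rules, as they might be derived from a textual process description.
rules = [
    Constraint("precedence", "check claim", "pay claim"),
    Constraint("response", "receive claim", "check claim"),
]

# Hypothetical candidate structures, each represented by the traces it allows.
candidates = {
    "sequence": [("receive claim", "check claim", "pay claim")],
    "parallel": [
        ("receive claim", "check claim", "pay claim"),
        ("receive claim", "pay claim", "check claim"),
    ],
}

kept = prune(candidates, rules)
```

Here the "parallel" candidate is discarded because it allows paying a claim before it has been checked, which violates the precedence rule. A real implementation would evaluate rules against candidate cuts during recursion rather than against enumerated traces.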
Key Contributions
- Domain-Aware Process Discovery Framework: The framework enables the encoding of process knowledge into declarative constraints that inform the model discovery process. This approach enhances traditional process discovery methods, which generally depend solely on event log data.
- Rule Extraction via LLMs: Through role prompting and prompt engineering techniques, LLMs are employed to parse process-related natural language texts and translate them into a formal set of rules. These rules can then be used to refine process models.
- Case Study Implementation: The framework's applicability is demonstrated in a real-life scenario at the UWV agency, showcasing improvements in process model fidelity compared to models generated without domain knowledge incorporation.
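To make the rule extraction step more concrete, the following Python sketch shows one way a role-prompted request could be assembled and the model's answer validated before it is used as a constraint. Everything here is an assumption for illustration: the prompt wording, the JSON output format, the template names, the activity names, and the `build_prompt` and `parse_rules` helpers are hypothetical and are not taken from the paper. No LLM is called; `sample_output` is a hand-written stand-in for a model response.

```python
import json
from typing import List, Tuple

Rule = Tuple[str, str, str]  # (template, activity a, activity b)

# Assumed set of supported constraint templates.
ALLOWED_TEMPLATES = ("response", "precedence")


def build_prompt(description: str, activities: List[str]) -> str:
    """Assemble a role-prompted instruction for the LLM (hypothetical wording)."""
    return (
        "You are a process mining expert.\n"  # role prompting
        "Translate the process description into declarative rules.\n"
        f"Allowed templates: {', '.join(ALLOWED_TEMPLATES)}.\n"
        f"Allowed activities: {', '.join(activities)}.\n"
        'Answer only with a JSON list of objects with keys "template", "a", "b".\n\n'
        f"Process description:\n{description}\n"
    )


def parse_rules(llm_output: str, activities: List[str]) -> Tuple[List[Rule], List[str]]:
    """Validate the model's answer; return accepted rules and rejection reasons."""
    try:
        items = json.loads(llm_output)
    except json.JSONDecodeError:
        return [], ["output is not valid JSON"]
    if not isinstance(items, list):
        return [], ["expected a JSON list of rules"]

    accepted: List[Rule] = []
    rejected: List[str] = []
    for item in items:
        if not isinstance(item, dict):
            rejected.append(f"not an object: {item!r}")
            continue
        template, a, b = item.get("template"), item.get("a"), item.get("b")
        if template not in ALLOWED_TEMPLATES:
            rejected.append(f"unknown template: {template!r}")
        elif a not in activities or b not in activities:
            # Guards against activities the model invented.
            rejected.append(f"unknown activity in: {item!r}")
        else:
            accepted.append((template, a, b))
    return accepted, rejected


activities = ["receive claim", "check claim", "pay claim"]
prompt = build_prompt("A claim is always checked before it is paid.", activities)

# Hand-written stand-in for a model response; the second rule names an
# activity that does not exist in the event log and should be rejected.
sample_output = (
    '[{"template": "precedence", "a": "check claim", "b": "pay claim"},'
    ' {"template": "response", "a": "receive claim", "b": "archive"}]'
)
accepted, rejected = parse_rules(sample_output, activities)
```

Validating the output against the known activity labels and a fixed template list is a design choice that keeps malformed or hallucinated rules from reaching the discovery step; rejected items could be fed back to the model or shown to a domain expert.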
Implications for Process Mining
The work has both theoretical and practical implications. Theoretically, it advances the process mining field by bringing in natural language processing to encode process knowledge in a machine-interpretable format. Practically, the framework gives organizations a way to use existing domain expertise to improve process model discovery, which in turn supports more accurate conformance checking and process improvement activities.
Future Directions
The paper opens several avenues for further research. There is potential to expand the range of declarative constraints supported by the IMr framework, as well as to refine LLM prompting strategies to improve the accuracy of rule extraction. Additionally, future work may explore the integration of real-time feedback loops from domain experts, enhancing the interactivity and adaptability of process model discovery through continuous learning paradigms.
Conclusion
This research presents a novel approach to process discovery that combines domain knowledge with event log data, using LLMs as a bridge. By incorporating structured domain input into process discovery, organizations can align discovered models more closely with real-world processes and with organizational objectives. The work is a step toward more integrated process management solutions that combine human expertise with machine learning capabilities.