ExOAR: Expert-Guided Object & Activity Recognition
- ExOAR is an interactive framework that integrates large language models with expert validation to generate semantically rich, event-centric logs from text.
- The method employs a staged pipeline, from candidate generation to iterative expert refinement, so that object types, instances, and activities are mapped under expert control.
- ExOAR bridges the gap between unstructured textual data and process mining requirements, enhancing reliability in object-centric analysis.
ExOAR (Expert-Guided Object and Activity Recognition) is an interactive framework for identifying and structuring objects and activities in unstructured textual data, addressing a central challenge of object-centric process mining workflows. By integrating large language models (LLMs) with expert validation, ExOAR supports the generation and refinement of object types, object instances, and activity classes, and thereby the conversion of raw text into the semantically rich, event-centric logs needed for process analysis (Beerepoot et al., 3 Dec 2025).
1. Semantic Gap in Object-Centric Process Mining
Object-centric process mining (OCPM) requires event logs with explicit links between each event and one or more object types (such as “invoice,” “customer,” “shipment”) and their respective object instances (such as “INV-1234,” “Acme Corp.,” “SHIP-5678”). Traditional approaches depend on structured data sources; however, most information in practical domains is available only as unstructured text, making the identification of relevant event–object relationships challenging. This semantic gap impedes effective analysis, trace discovery, and conformance checking.
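The event-to-object links described above can be pictured with a small data structure. The following is a minimal Python sketch, not ExOAR's own data model: the class names `ObjectRef` and `Event`, the field names, and the activity and timestamp values are assumptions made for illustration; only the object types and identifiers come from the examples in the text.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class ObjectRef:
    """A reference to one object instance of a given object type."""
    object_type: str  # e.g. "invoice"
    object_id: str    # e.g. "INV-1234"


@dataclass
class Event:
    """One event in an object-centric log, linked to one or more objects."""
    event_id: str
    activity: str
    timestamp: str  # ISO 8601 text, kept as a string for simplicity
    objects: List[ObjectRef] = field(default_factory=list)

    def object_types(self) -> Set[str]:
        """Return the distinct object types this event refers to."""
        return {ref.object_type for ref in self.objects}


# Illustrative event: the activity name and timestamp are made up.
event = Event(
    event_id="e1",
    activity="Send invoice",
    timestamp="2025-01-15T09:30:00",
    objects=[
        ObjectRef("invoice", "INV-1234"),
        ObjectRef("customer", "Acme Corp."),
    ],
)
```

A single event here refers to two object types at once, which is the property that distinguishes object-centric logs from classical single-case event logs.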
A plausible implication is that fully automated solutions are not yet reliable enough to build complete semantic event logs from text without human intervention, owing to domain-specific intricacies and ambiguous language.
2. ExOAR Framework Overview
ExOAR operationalizes expert-guided recognition by combining machine intelligence and human expertise across a staged pipeline. The workflow can be formalized as follows:
Let $T$ denote a corpus of unstructured textual data, and let $C$ represent contextual information, such as the profession of the user.
ExOAR executes:
- Stage 1 (Candidate Generation): An LLM processes $(T, C)$ to generate a list $O$ of plausible object types, a list $A$ of activity types, and a list $I$ of object instances.
- Stage 2 (Expert Review): Human experts vet the candidate sets $(O, A, I)$, discarding irrelevant entries and refining ambiguous ones.
- Stage 3 (Iterative Refinement): The process may repeat, invoking additional LLM suggestions informed by prior expert input.
This approach maintains both flexibility and human oversight, ensuring structured log generation aligns with domain conventions and semantic requirements (Beerepoot et al., 3 Dec 2025).
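The three stages can be sketched as a simple control loop. This is a minimal Python illustration under stated assumptions, not the published implementation: `run_pipeline`, `Candidates`, the callback signatures, and the `max_rounds` limit are hypothetical, and `stub_generate` and `stub_review` are hard-coded stand-ins for the LLM call and the human expert.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidates:
    object_types: List[str]
    activity_types: List[str]
    object_instances: List[str]


# Hypothetical callback signatures (assumptions, not ExOAR's API).
Generator = Callable[[List[str], str, Optional[Candidates]], Candidates]
Reviewer = Callable[[Candidates], Tuple[Candidates, bool]]


def run_pipeline(corpus: List[str], context: str, generate: Generator,
                 review: Reviewer, max_rounds: int = 3) -> Candidates:
    """Alternate candidate generation and expert review until the expert is done."""
    feedback: Optional[Candidates] = None
    approved = Candidates([], [], [])
    for _ in range(max_rounds):
        proposed = generate(corpus, context, feedback)  # Stage 1
        approved, done = review(proposed)               # Stage 2
        if done:
            break
        feedback = approved                             # Stage 3: feed back
    return approved


def stub_generate(corpus, context, feedback):
    """Stand-in for an LLM call; returns fixed candidates for illustration."""
    if feedback is None:
        return Candidates(["invoice", "window"], ["Send invoice"], ["INV-1234"])
    # Later rounds: keep what the expert approved and propose one more type.
    return Candidates(feedback.object_types + ["customer"],
                      feedback.activity_types, feedback.object_instances)


def stub_review(proposed):
    """Stand-in for the expert: rejects 'window', stops once 'customer' appears."""
    kept = [t for t in proposed.object_types if t != "window"]
    done = "customer" in kept
    return Candidates(kept, proposed.activity_types, proposed.object_instances), done


result = run_pipeline(["Sent invoice INV-1234 to Acme Corp."], "accountant",
                      stub_generate, stub_review)
```

Passing the generator and reviewer as callbacks keeps the loop independent of any particular LLM provider or review interface, which is one plausible way to keep the expert in control of termination.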
3. Integration of LLMs with Human Validation
Central to ExOAR is the use of LLMs for initial candidate extraction. The LLM utilizes contextual priors—incorporating the user's profession or task—to tailor candidate detection. Despite advances in LLM semantic extraction, ExOAR enforces mandatory expert intervention at each decisive stage. Expert review serves to:
- Disambiguate object and activity candidates.
- Refine object instance mappings.
- Uphold domain-specific semantics.
This suggests that the method exploits the contextual strengths of LLMs while leaving final authority with expert judgement, which improves reliability for OCPM alignment.
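One way to picture the expert's role is as a set of explicit decisions applied to the LLM's candidates. The sketch below is an assumption-laden illustration rather than ExOAR's interface: the `Decision` type, the three action names, and the rule that unreviewed candidates are excluded are all hypothetical choices made to reflect the principle that the LLM output is never accepted silently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Decision:
    action: str                       # "accept", "reject", or "rename"
    new_label: Optional[str] = None   # required when action == "rename"


def apply_review(candidates: List[str],
                 decisions: Dict[str, Decision]) -> List[str]:
    """Apply expert decisions; candidates without a decision are left out."""
    approved: List[str] = []
    for cand in candidates:
        decision = decisions.get(cand)
        if decision is None or decision.action == "reject":
            continue  # no silent acceptance of unreviewed candidates
        if decision.action == "accept":
            label = cand
        elif decision.action == "rename":
            if not decision.new_label:
                raise ValueError(f"rename of {cand!r} needs a new label")
            label = decision.new_label
        else:
            raise ValueError(f"unknown action: {decision.action!r}")
        if label not in approved:  # renames may merge duplicates
            approved.append(label)
    return approved


# Illustrative candidates and decisions (made up for this example).
candidates = ["doc", "invoice", "browser tab", "bill", "misc"]
decisions = {
    "doc": Decision("rename", "document"),
    "invoice": Decision("accept"),
    "browser tab": Decision("reject"),
    "bill": Decision("rename", "invoice"),
}
approved = apply_review(candidates, decisions)
```

In this example "bill" is merged into "invoice" and "misc" is dropped because no decision was recorded for it.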
4. Implementation and Validation
ExOAR is deployed as a practical toolkit, supporting interactive review and refinement. Its initial validation utilized a demonstration scenario and proceeded to empirical evaluation with Active Window Tracking data from five users. The evaluation investigated whether ExOAR could construct structured logs that retain clear semantics suitable for object-centric process analysis, starting from naturally occurring activity traces recorded in textual form.
A plausible implication is that the framework may generalize to other domains where textual workflows predominate; however, the presented evaluation is strictly limited to the provided dataset and tool scenario.
5. Contribution to Object-Centric Process Analysis
By bridging the gap between unstructured textual records and structured semantic event logs, ExOAR supports:
- Explicit mapping of activity occurrences to object instances and types.
- Log generation with process mining–ready structure, enabling trace-based or graph-based analysis modalities.
- Enhanced flexibility compared to rigid rule-based extraction schemes, while retaining auditability and expert oversight.
The significance is primarily methodological: ExOAR advances the state of the art in process mining by accommodating domains that lack fully structured input, without sacrificing interpretability or domain congruence (Beerepoot et al., 3 Dec 2025).
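To illustrate how such a log enables trace-based analysis, the following Python sketch flattens an object-centric log into one activity trace per object instance of a chosen type. Flattening is a standard OCPM technique and is used here as an example, not as a step the source attributes to ExOAR; the tuple layout, the function name `flatten`, and the sample events are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each event: (ISO timestamp, activity, [(object_type, object_id), ...])
Event = Tuple[str, str, List[Tuple[str, str]]]


def flatten(events: List[Event], object_type: str) -> Dict[str, List[str]]:
    """Return one time-ordered activity trace per instance of object_type."""
    traces: Dict[str, List[str]] = defaultdict(list)
    # ISO 8601 timestamps in the same format sort correctly as strings.
    for _, activity, objects in sorted(events, key=lambda e: e[0]):
        for obj_type, obj_id in objects:
            if obj_type == object_type:
                traces[obj_id].append(activity)
    return dict(traces)


# Illustrative events (activities and timestamps are made up).
events: List[Event] = [
    ("2025-01-15T09:00", "Create invoice",
     [("invoice", "INV-1234"), ("customer", "Acme Corp.")]),
    ("2025-01-15T10:00", "Ship goods",
     [("shipment", "SHIP-5678"), ("customer", "Acme Corp.")]),
    ("2025-01-15T09:30", "Send invoice",
     [("invoice", "INV-1234")]),
]

invoice_traces = flatten(events, "invoice")
customer_traces = flatten(events, "customer")
```

The same events yield different traces depending on the object type chosen as the case notion, which is why explicit event-to-object links matter for downstream analysis.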
6. Limitations and Future Directions
ExOAR is validated in a constrained application domain (Active Window Tracking data, five users). Its effectiveness in broader, more complex settings remains an open question. The framework presupposes expert availability for review and semantic refinement, which may not scale in environments with high event volume or limited expert bandwidth.
A plausible implication is that ongoing research might explore semi-automated or confidence-weighted review processes, or further task-specific fine-tuning of LLM extraction routines, provided that the semantic fidelity and interpretability required in process mining workflows are preserved.
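As a rough idea of what confidence-weighted review could look like, the sketch below routes low-confidence candidates to an expert queue and passes the rest through. This is purely hypothetical: the source describes it only as a possible research direction, and the function `route_candidates`, the per-candidate scores, and the 0.8 threshold are assumptions with no basis in the ExOAR evaluation.

```python
from typing import List, Tuple


def route_candidates(scored: List[Tuple[str, float]],
                     threshold: float = 0.8) -> Tuple[List[str], List[str]]:
    """Split candidates into auto-accepted ones and an expert review queue.

    The review queue is ordered from least to most confident so that the
    expert sees the most doubtful candidates first.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    auto_accepted = [label for label, score in scored if score >= threshold]
    needs_review = [label
                    for label, score in sorted(scored, key=lambda p: p[1])
                    if score < threshold]
    return auto_accepted, needs_review


# Illustrative scores (made up; not produced by any real model).
scored = [("invoice", 0.95), ("browser tab", 0.40),
          ("customer", 0.85), ("doc", 0.60)]
auto_accepted, needs_review = route_candidates(scored)
```

Any such scheme trades expert effort against the risk of accepting a wrong candidate, so the threshold would need to be chosen with the semantic-fidelity requirement in mind.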