Object-Centric Process Mining
- Object-centric process mining is an advanced paradigm that models and analyzes multiple interacting business objects using OCEL 2.0-based event data.
- It employs techniques like Object-Centric Directly-Follows Graphs and Petri Nets to capture inter-object relationships and complex process dynamics.
- Innovative algorithms for discovery, conformance, and performance analysis enhance scalability and interpretability for multi-dimensional process logs.
Object-centric process mining (OCPM) is an advanced process mining paradigm that models, discovers, and analyzes the intertwined dynamics of multiple interacting business objects within event data. Unlike traditional case-centric approaches, object-centric process mining operates directly on object-centric event data (OCED), which reflects the true multi-object nature of operational processes as captured by formats such as OCEL 2.0. By capturing event-object and object-object relationships natively, OCPM enables the analysis of complex, interrelated behaviors that span multiple functional units and value chain segments, and supports both fine-grained and aggregated process insights (Koren et al., 4 Mar 2024, Berti et al., 2023, Khayatbashi et al., 26 Aug 2025).
1. Formal Foundations: Object-Centric Event Data and Log Structures
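The foundational data structure in OCPM is the object-centric event log, formalized as a tuple
$$L = (E, O, EA, OA, \mathit{evtype}, \mathit{time}, \mathit{objtype}, \ldots, E2O, O2O)$$
where:
- $E$ is a finite set of events and $O$ is a finite set of objects (with $E \cap O = \emptyset$).
- $EA$ and $OA$ are the sets of event- and object-attribute names.
- Attribute functions such as $\mathit{evtype}$, $\mathit{time}$, and $\mathit{objtype}$ specify event and object types, timestamps, and qualifiers.
- $E2O \subseteq E \times Q \times O$ encodes event-to-object links with qualifiers from a set $Q$ (e.g., "created", "used").
- $O2O \subseteq O \times Q \times O$ describes explicit object-to-object relations (e.g., "contains").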
The dominant logging standard, OCEL 2.0, supports dynamic object attributes, qualified object relations, and temporally evolving object states (Koren et al., 4 Mar 2024, Goossens et al., 21 May 2024).
Events in OCED can reference multiple objects of varying types with arbitrary cardinality. Qualifiers and cardinality constraints keep these links traceable: for example, every event relates to at least one object, while a single object may take part in many events and be related to many other objects.
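The log structure described above can be sketched as a small in-memory data model. This is a minimal illustration, not the OCEL 2.0 reference implementation: the class names (`Event`, `Obj`, `OCEL`), the helper `orphan_events`, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Event:
    eid: str
    activity: str        # event type
    timestamp: datetime


@dataclass(frozen=True)
class Obj:
    oid: str
    otype: str           # object type


@dataclass
class OCEL:
    events: dict = field(default_factory=dict)   # event id -> Event
    objects: dict = field(default_factory=dict)  # object id -> Obj
    e2o: set = field(default_factory=set)        # (event id, qualifier, object id)
    o2o: set = field(default_factory=set)        # (source id, qualifier, target id)

    def objects_of(self, eid):
        """Ids of all objects linked to the given event."""
        return {o for (e, _, o) in self.e2o if e == eid}

    def orphan_events(self):
        """Events that violate the 'at least one object' constraint."""
        return sorted(e for e in self.events if not self.objects_of(e))


log = OCEL()
log.events["e1"] = Event("e1", "place order", datetime(2024, 1, 1, 9, 0))
log.events["e2"] = Event("e2", "pick item", datetime(2024, 1, 1, 10, 0))
log.events["e3"] = Event("e3", "send reminder", datetime(2024, 1, 1, 11, 0))
for oid, otype in [("o1", "order"), ("i1", "item"), ("i2", "item")]:
    log.objects[oid] = Obj(oid, otype)

# One event may reference several objects of different types.
log.e2o |= {
    ("e1", "created", "o1"),
    ("e1", "created", "i1"),
    ("e1", "created", "i2"),
    ("e2", "used", "i1"),
}
log.o2o |= {("o1", "contains", "i1"), ("o1", "contains", "i2")}

print(sorted(log.objects_of("e1")))
print(log.orphan_events())
```

Storing the relations as sets of qualified triples keeps the sketch close to the formal definition; a production library would typically index them for faster lookup.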
2. Fundamental Modeling Approaches and Expressive Power
OCPM extends classical process modeling by embracing several high-expressivity paradigms for handling multi-object interactions:
- Object-Centric Directly-Follows Graphs (OC-DFGs): Typed multigraphs where each arc is colored by the object type witnessing a directly-follows relation. This structure supports the identification of multi-type control flows unrealizable in single-case models (Berti et al., 2022, Berti et al., 2023).
- Object-Centric Petri Nets (OCPNs): These feature typed places holding object-identifying tokens and transitions that synchronize multiple objects. Variable arcs model cardinality flexibility (e.g., one-to-many object participation). The semantics ensures correct firing conditions over bindings of typed objects (Seidel et al., 18 Aug 2025, Berti et al., 2022).
- Object-Centric Petri Nets with Identifiers (OPIDs): Tokens are tuples of object identifiers, allowing representation of explicit inter-object relationships and supporting synchronization constraints such as stable many-to-one relationships (Seidel et al., 18 Aug 2025).
- Declarative Artifact and Behavioral Constraint Models: Declarative rules over object-activity pairs and relations (e.g., object-centric behavioral constraints) formalize temporal or relational requirements between different object classes (Berti et al., 2023).
The ability to represent explicit object-to-object synchronization, relationship constraints, and evolving object states is central to process discovery and conformance checking that capture the binding behavior and inter-object dependencies of real processes.
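To make the notion of a stable many-to-one relationship more concrete, the following sketch checks a simple necessary condition on event data: no object of the "many" type ever shares an event with two different objects of the "one" type. This is an illustrative heuristic only, not the discovery procedure of Seidel et al.; the function names and the sample events are assumptions.

```python
from collections import defaultdict


def parents_seen(events, child_type, parent_type):
    """Map each child object to the set of parent objects it shares an event with."""
    seen = defaultdict(set)
    for objs in events:  # objs: {object id: object type} for one event
        parents = {o for o, t in objs.items() if t == parent_type}
        for o, t in objs.items():
            if t == child_type:
                seen[o] |= parents
    return dict(seen)


def looks_stable_many_to_one(events, child_type, parent_type):
    """True if no child object ever co-occurs with two different parents."""
    seen = parents_seen(events, child_type, parent_type)
    return all(len(parents) <= 1 for parents in seen.values())


# Hypothetical events, each given as the objects it references.
events = [
    {"o1": "order", "i1": "item", "i2": "item"},  # place order
    {"i1": "item"},                               # pick item
    {"o1": "order", "i1": "item", "i2": "item"},  # ship order
    {"i1": "item", "p1": "package"},              # pack item
    {"i1": "item", "p2": "package"},              # repack into another package
]

print(looks_stable_many_to_one(events, "item", "order"))
print(looks_stable_many_to_one(events, "item", "package"))
```

In this toy data every item stays with a single order, while item `i1` moves between two packages, so only the item-to-order relationship passes the check.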
3. Scope Definition, Aggregation, and Multi-Level Analysis
Object-centric event data often encompasses multiple, interrelated processes without explicit process boundaries. Existing event log formats lack a direct representation of process "scopes," which impedes multi-level analytics. Analysts can address this by defining process scopes as first-class objects of type "process" within the OCEL. A scope is formally specified as a pair $s = (E_s, O_s)$, where $E_s$ (events in scope) and $O_s$ (objects in scope) are subsets of the log's events and objects that are attached to the scope object via dedicated qualifiers (Khayatbashi et al., 26 Aug 2025).
Scoping is achieved by an analyst-authored enrichment ruleset (written in a domain-specific language) that specifies inclusion and exclusion conditions over event/object attributes and types. The embedding function $\mathit{embed}: (L, \mathcal{R}) \mapsto L'$ maps an OCEL $L$ and a set of scope rulesets $\mathcal{R}$ to a scope-enriched OCEL $L'$, in which each new scope is an object linked to its constituent events and objects (Khayatbashi et al., 26 Aug 2025).
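A rough sketch of such an enrichment step is shown below. Plain Python predicates stand in for the domain-specific rule language, and the names `ScopeRule`, `enrich`, and the qualifiers `in_scope` and `has_object` are hypothetical; the framework of Khayatbashi et al. may define these differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScopeRule:
    name: str
    include: Callable[[dict], bool]                    # inclusion condition on an event
    exclude: Callable[[dict], bool] = lambda ev: False  # exclusion condition (default: none)


def enrich(events, rules):
    """Return the scope objects and links that would be added to the log."""
    scope_objects = {}  # scope id -> object type
    e2o = set()         # (event id, qualifier, scope id)
    o2o = set()         # (scope id, qualifier, object id)
    for rule in rules:
        sid = f"scope:{rule.name}"
        scope_objects[sid] = "process"
        for ev in events:
            if rule.include(ev) and not rule.exclude(ev):
                e2o.add((ev["id"], "in_scope", sid))
                for oid in ev["objects"]:
                    o2o.add((sid, "has_object", oid))
    return scope_objects, e2o, o2o


# Hypothetical events: id, activity, and referenced objects with their types.
events = [
    {"id": "e1", "activity": "place order", "objects": {"o1": "order"}},
    {"id": "e2", "activity": "pick item", "objects": {"i1": "item", "o1": "order"}},
    {"id": "e3", "activity": "create invoice", "objects": {"inv1": "invoice", "o1": "order"}},
    {"id": "e4", "activity": "cancel order", "objects": {"o2": "order"}},
]

rules = [
    ScopeRule(
        "fulfilment",
        include=lambda ev: ev["activity"] in {"place order", "pick item", "cancel order"},
        exclude=lambda ev: ev["activity"] == "cancel order",
    ),
    ScopeRule("billing", include=lambda ev: "invoice" in ev["objects"].values()),
]

scopes, scope_e2o, scope_o2o = enrich(events, rules)
print(sorted(scopes))
print(sorted(scope_e2o))
```

Because the function only returns additional objects and links, the original events stay untouched, which matches the idea of rescoping without re-exporting raw data.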
This mechanism enables:
- Intra-scope analysis: Scope-specific sublogs support traditional OCPM operations (e.g., process discovery, compliance checking) confined to the scope.
- Inter-scope analysis: Construction of a directed process interaction graph $G = (S, H)$ over the set of scopes $S$, where edges in $H$ represent shared-object handovers across process scopes (an edge $(s_i, s_j, t)$ means a shared object of type $t$ links $s_i$ and $s_j$).
- Multi-level drill-down/roll-up: Analysts may define nested or hierarchical scopes, enabling aggregation to higher-level processes or detailed drill-down into subprocesses (Khayatbashi et al., 26 Aug 2025, Khayatbashi et al., 30 Nov 2024).
This multi-level structuring aligns analysis with real organizational perspectives (e.g., business-unit vs. operational roles) and supports agile "what-if" rescoping without re-exporting raw data.
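The inter-scope interaction graph can be approximated as follows. The sketch assumes that each activity belongs to at most one scope and that an edge points from the scope where a shared object is first seen to any scope where it first appears later; both are simplifying assumptions for illustration, and the function name `interaction_graph` is hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def interaction_graph(events, scope_of):
    """Count handover edges (source scope, target scope, object type)."""
    first_seen = defaultdict(dict)  # object id -> {scope: earliest timestamp}
    otype = {}
    for ts, activity, objs in events:
        scope = scope_of.get(activity)
        if scope is None:
            continue  # event lies outside every scope
        for oid, t in objs.items():
            otype[oid] = t
            prev = first_seen[oid].get(scope)
            if prev is None or ts < prev:
                first_seen[oid][scope] = ts

    edges = defaultdict(int)
    for oid, seen in first_seen.items():
        for s1, s2 in permutations(seen, 2):
            if seen[s1] < seen[s2]:  # object reached s1 before s2
                edges[(s1, s2, otype[oid])] += 1
    return dict(edges)


# Hypothetical events: (timestamp, activity, {object id: object type}).
events = [
    (1, "place order", {"o1": "order"}),
    (2, "pick item", {"o1": "order", "i1": "item"}),
    (3, "create invoice", {"o1": "order", "inv1": "invoice"}),
    (4, "pay invoice", {"inv1": "invoice"}),
]
scope_of = {
    "place order": "fulfilment",
    "pick item": "fulfilment",
    "create invoice": "billing",
    "pay invoice": "billing",
}

graph = interaction_graph(events, scope_of)
print(graph)
```

Here only order `o1` appears in both scopes, first in fulfilment and later in billing, so the graph contains a single order-typed edge between them.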
4. Key Algorithms: Discovery, Conformance, and Performance Analysis
4.1. Process Discovery
The dominant workflow proceeds as follows:
- Flattening per object type: For each object type $ot$, extract the sublog of events referencing at least one object of that type, producing a flattened log $L_{ot}$ for each $ot$.
- Sub-discovery: Apply standard process discovery algorithms (e.g., Inductive Miner, Alpha Miner) to each $L_{ot}$.
- Collation: Merge the resulting per-type models into an OC-DFG or OCPN, decorating arcs or places by object type and synchronizing transitions as appropriate (Berti et al., 2023, Berti et al., 2022).
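The three steps above can be illustrated with a compact sketch. For brevity, plain directly-follows counting replaces a full discovery algorithm such as the Inductive Miner in the sub-discovery step, so the result is an OC-DFG with typed, counted arcs rather than an OCPN. The function names and the sample log are assumptions.

```python
from collections import Counter, defaultdict


def flatten(events, otype):
    """One trace per object of `otype`: the time-ordered activities referencing it."""
    traces = defaultdict(list)
    for _, activity, objs in sorted(events, key=lambda e: e[0]):
        for oid, t in objs.items():
            if t == otype:
                traces[oid].append(activity)
    return dict(traces)


def dfg(traces):
    """Directly-follows counts over a set of traces."""
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


def oc_dfg(events):
    """Collate per-type DFGs into arcs (source, target, object type) -> count."""
    types = {t for _, _, objs in events for t in objs.values()}
    return {
        (a, b, t): n
        for t in sorted(types)
        for (a, b), n in dfg(flatten(events, t)).items()
    }


# Hypothetical events: (timestamp, activity, {object id: object type}).
events = [
    (1, "place order", {"o1": "order", "i1": "item", "i2": "item"}),
    (2, "pick item", {"i1": "item"}),
    (3, "pick item", {"i2": "item"}),
    (4, "ship order", {"o1": "order", "i1": "item", "i2": "item"}),
]

for arc, count in oc_dfg(events).items():
    print(arc, count)
```

The order perspective sees "place order" directly followed by "ship order", whereas the item perspective passes through "pick item"; keeping the object type on each arc is what preserves both views in one graph.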
4.2. Conformance Checking
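OCPM requires new definitions of fitness and precision to account for multi-object synchronization. Following Adams et al. (2021):
$$\mathit{fitness}(L, M) = \frac{1}{|E|} \sum_{e \in E} \frac{|en_L(e) \cap en_M(e)|}{|en_L(e)|}, \qquad \mathit{precision}(L, M) = \frac{1}{|E_f|} \sum_{e \in E_f} \frac{|en_L(e) \cap en_M(e)|}{|en_M(e)|}$$
where $en_L(e)$ and $en_M(e)$ are the sets of activities enabled in the log and in the model, respectively, after the context of $e$, and $E_f \subseteq E$ contains the events whose context can be replayed on the model (so that $en_M(e) \neq \emptyset$).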
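As a worked illustration, the sketch below averages, over events, the share of log-enabled activities that the model also enables (fitness) and the share of model-enabled activities that the log also shows (precision, restricted to events whose context the model can replay). It assumes the enabled-activity sets have already been computed per event; computing them from contexts is the hard part and is not shown. The function names and numbers are hypothetical and only follow the spirit of Adams et al. (2021).

```python
def fitness(en_log, en_model):
    """Average share of log-enabled activities that the model also enables."""
    total = sum(len(en_log[e] & en_model[e]) / len(en_log[e]) for e in en_log)
    return total / len(en_log)


def precision(en_log, en_model):
    """Average share of model-enabled activities also enabled in the log,
    taken over events whose context the model can replay."""
    replayable = [e for e in en_log if en_model[e]]
    total = sum(len(en_log[e] & en_model[e]) / len(en_model[e]) for e in replayable)
    return total / len(replayable)


# Hypothetical enabled-activity sets per event (after the event's context).
en_log = {
    "e1": {"a"},
    "e2": {"b", "c"},
    "e3": {"d"},
    "e4": {"f"},
}
en_model = {
    "e1": {"a"},       # model matches the log
    "e2": {"b"},       # model misses "c": lowers fitness
    "e3": {"d", "e"},  # model allows extra "e": lowers precision
    "e4": set(),       # context not replayable: counts against fitness only
}

print(round(fitness(en_log, en_model), 3))    # 0.625
print(round(precision(en_log, en_model), 3))  # 0.833
```

Event `e4` contributes zero to fitness and is skipped for precision, which is why the two averages use different denominators (4 and 3).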
4.3. Performance Analysis
Distinct OCPM-specific time metrics can be computed via token-based replay on OCPNs:
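- Synchronization time: $\mathit{sync} = \max_{tv \in TV} \mathit{start}(tv) - \min_{tv \in TV} \mathit{start}(tv)$ for the set of related token visits $TV$, i.e., the time between the first and the last related token becoming available.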
- Pooling time and lagging time capture gathering delays for object groups, exposing interaction inefficiencies (e.g., waiting for all items before an order is shipped) (Park et al., 2022).
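A small numeric illustration of these measures follows. It uses a simplified reading in which synchronization time is the span between the earliest and latest arrival of all related tokens, and pooling time is the same span restricted to one object type; lagging time is omitted. The exact definitions in Park et al. (2022) should be consulted, and the function names and timestamps are assumptions.

```python
from datetime import datetime, timedelta


def synchronization_time(arrivals):
    """Span between the first and last related token arrival."""
    times = [t for _, _, t in arrivals]
    return max(times) - min(times)


def pooling_time(arrivals, otype):
    """Span between the first and last arrival among tokens of one object type."""
    times = [t for _, ot, t in arrivals if ot == otype]
    return max(times) - min(times) if times else timedelta(0)


# Hypothetical token arrivals before a "ship order" transition:
# (object id, object type, time the token became available).
arrivals = [
    ("o1", "order", datetime(2024, 1, 1, 9, 0)),
    ("i1", "item", datetime(2024, 1, 1, 10, 0)),
    ("i2", "item", datetime(2024, 1, 1, 13, 30)),
]

print(synchronization_time(arrivals))  # 4:30:00
print(pooling_time(arrivals, "item"))  # 3:30:00
```

In this example the order waits four and a half hours for its last item, of which three and a half hours are spent gathering the items themselves.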
5. Granularity Operations, Clustering, and Local Modeling
Large-scale or heterogeneous OCELs often lead to complex or "spaghetti" models. Several techniques have been developed to enhance interpretability and adapt granularity:
- Granularity Operations: Four reversible operations—drill-down, roll-up, unfold, fold—on object and event types support dynamic adjustment between detailed and coarse process views, aiding zoom-in/zoom-out during discovery and supporting segmentation by object-attribute slices (Khayatbashi et al., 30 Nov 2024).
- Clustering Techniques: Object-centric clustering groups similar objects (e.g., via profile vectors incorporating traces, graph metrics, and attributes) using distance measures (edit, Euclidean, categorical) and standard clustering algorithms (k-means, hierarchical). Results demonstrate drastic reductions in OC-DFG model complexity while maintaining or improving discovery fitness (Ghahfarokhi et al., 2022, Jalali, 2022).
- Object-Centric Local Process Models (OCLPMs): Algorithmic discovery of frequently recurring multi-object behavioral patterns, realized as OCPN fragments, facilitates focused analysis and pattern mining across highly entangled logs (Peeva et al., 4 Nov 2024).
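As an illustration of how a granularity operation can be reversible, the sketch below drills down on an object type by splitting it along an attribute value and rolls it back up again. This is a simplified reading of the operations, not the definitions of Khayatbashi et al.; the subtype naming scheme `type[attr=value]` and the function names are assumptions.

```python
def drill_down(objects, otype, attr):
    """Split `otype` into subtypes named 'otype[attr=value]'."""
    refined = {}
    for oid, obj in objects.items():
        if obj["type"] == otype and attr in obj["attrs"]:
            subtype = f"{otype}[{attr}={obj['attrs'][attr]}]"
            refined[oid] = {**obj, "type": subtype}
        else:
            refined[oid] = obj
    return refined


def roll_up(objects, otype):
    """Merge all 'otype[...]' subtypes back into `otype`."""
    prefix = otype + "["
    return {
        oid: ({**obj, "type": otype} if obj["type"].startswith(prefix) else obj)
        for oid, obj in objects.items()
    }


# Hypothetical objects with a type and attribute values.
objects = {
    "o1": {"type": "order", "attrs": {"priority": "high"}},
    "o2": {"type": "order", "attrs": {"priority": "low"}},
    "i1": {"type": "item", "attrs": {}},
}

fine = drill_down(objects, "order", "priority")
print(sorted({obj["type"] for obj in fine.values()}))
print(roll_up(fine, "order") == objects)
```

After drilling down, discovery would produce separate arcs or places for high- and low-priority orders; rolling up restores the original typing, so the analyst can move between the two views without touching the underlying events.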
6. Tooling, Storage, and Data Engineering
Object-centric process mining is supported by a growing ecosystem of open-source tools:
- OCEL 2.0 Resources: Formal specification, example logs, and library support for OCEL 2.0 are consolidated at www.ocel-standard.org (Koren et al., 4 Mar 2024).
- Analysis Frameworks: Major platforms include OC-PM (web/ProM), ocpa (Python), PM4Py-MDL, and application-specific tools (e.g., local model discovery plugins for ProM) (Berti et al., 2022, Peeva et al., 4 Nov 2024, Khayatbashi et al., 30 Nov 2024).
- Storage Architectures: Scalable storage is enabled by mapping OCEL to document-oriented databases (e.g., MongoDB), supporting aggregation pipelines for directly-follows discovery and lifecycle extraction on logs with tens of millions of events (Berti et al., 2022). More recently, relational hub-and-spoke architectures with process-agnostic 3NF schemas have been advocated for high-frequency, streaming, and schema-evolving environments (Bosmans et al., 1 Oct 2024).
- Data extraction methodologies: OCPM² extends the PM² methodology for systematic OCED extraction, emphasizing conceptual modeling, extraction matrices, automated verification, and iterative improvement—crucial for reproducible analysis (Miri et al., 13 Mar 2025).
7. Challenges, Limitations, and Research Directions
Despite rapid methodological and tooling advances, several challenges persist:
- Process Scope Definition: Automated detection of meaningful scopes remains unsolved. Current solutions depend on manually authored rulesets; clustering over object-to-object motifs has been proposed as a future direction (Khayatbashi et al., 26 Aug 2025).
- Relationship Semantics and Synchronization: Standard OCPNs do not enforce intended object relationships, necessitating mappings to OPIDs for explicit synchronization (e.g., stable many-to-one bindings) to avoid underspecification (Seidel et al., 18 Aug 2025).
- Scalability, Streaming, and Data Evolution: Handling massive, fast-evolving logs with unstructured data (e.g., email, IoT) or high schema change rates is an open area for data engineering and query optimization research (Bosmans et al., 1 Oct 2024).
- Model Quality and Benchmarks: Generalized quality metrics (fitness/precision) exist, but large-scale, multi-object benchmarks and gold standards for empirical comparison are still lacking (Adams et al., 2021, Goossens et al., 21 May 2024).
- Tool Interoperability and Standardization: The coexistence of multiple OCED/OCEL variants and limitations in schema evolution/type inheritance hinder cross-tool compatibility; specification convergence is recommended (Goossens et al., 21 May 2024, Koren et al., 4 Mar 2024).
Emerging trends include: extension of enrichment languages for scopes (temporal/pattern constraints), semi-automated scope suggestion, streaming-ready object-centric repositories, and the integration of knowledge-graph or machine learning techniques for variant mining and predictive analytics.
References:
- (Khayatbashi et al., 26 Aug 2025) Enriching Object-Centric Event Data with Process Scopes: A Framework for Aggregation and Analysis
- (Seidel et al., 18 Aug 2025) To bind or not to bind? Discovering Stable Relationships in Object-centric Processes (Extended Version)
- (Berti et al., 2023) Advancements and Challenges in Object-Centric Process Mining: A Systematic Literature Review
- (Koren et al., 4 Mar 2024) OCEL 2.0 Resources -- www.ocel-standard.org
- (Khayatbashi et al., 30 Nov 2024) Advancing Object-Centric Process Mining with Multi-Dimensional Data Operations
- (Goossens et al., 21 May 2024) Object-Centric Event Logs: Specifications, Comparative Analysis and Refinement
- (Ghahfarokhi et al., 2022) Clustering Object-Centric Event Logs
- (Peeva et al., 4 Nov 2024) Object-Centric Local Process Models
- (Miri et al., 13 Mar 2025) OCPM²: Extending the Process Mining Methodology for Object-Centric Event Data Extraction
- (Berti et al., 2022) OC-PM: Analyzing Object-Centric Event Logs and Process Models
- (Adams et al., 2021) Precision and Fitness in Object-Centric Process Mining
- (Berti et al., 2022) A Scalable Database for the Storage of Object-Centric Event Logs
- (Bosmans et al., 1 Oct 2024) Dynamic and Scalable Data Preparation for Object-Centric Process Mining
- (Park et al., 2023) Analyzing An After-Sales Service Process Using Object-Centric Process Mining: A Case Study
- (Jalali, 2022) Object Type Clustering using Markov Directly-Follow Multigraph in Object-Centric Process Mining
- (Adams et al., 2022) Defining Cases and Variants for Object-Centric Event Data
- (Park et al., 2022) OPerA: Object-Centric Performance Analysis
- (Ghahfarokhi et al., 2021) Process Comparison Using Object-Centric Process Cubes