Object-Centric Data Setting

Updated 5 December 2025
  • Object-centric data setting is a paradigm that enables events to reference multiple heterogeneous business objects, allowing detailed multi-perspective process analysis.
  • It utilizes multi-table and graph-based schemas to capture dynamic attribute evolution and complex inter-object relationships for scalable analytics.
  • This framework drives advanced applications in process mining, causal learning, computer vision, and IoT by overcoming traditional single-case limitations.

An object-centric data setting is a paradigm for structuring, storing, and analyzing data in which each event is permitted to reference multiple business objects of potentially different types. This setting has become foundational in process mining, causal representation learning, computer vision, and machine learning, as it overcomes limitations of traditional event logs bound to single “case notions.” The object-centric framework captures the rich web of interactions among heterogeneous entities as they co-evolve through a process, enabling detailed, multi-perspective analysis and advanced modeling.

1. Formal Definition and Core Principles

The object-centric data setting replaces the single-case or single-object assumption by allowing events to refer to arbitrary subsets of business objects of designated types. The canonical abstract model is an object-centric event log, represented as a tuple such as

$L = (E, O, T, \pi, \tau)$

where:

  • $E$ is a finite set of events;
  • $O$ is a finite set of objects;
  • $T$ is a finite set of object types;
  • $\pi : E \rightarrow 2^O$ maps each event to the set of involved objects;
  • $\tau : E \rightarrow \mathbb{R}_+$ assigns timestamps.
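
As a minimal sketch of the tuple above, the log can be held in a small Python container. The class name `ObjectCentricLog`, its field names, and the sample order/item data are illustrative assumptions, not part of any cited format; the tuple itself does not say which object has which type, so no type function is modelled here.

```python
from dataclasses import dataclass


@dataclass
class ObjectCentricLog:
    """Hypothetical container mirroring L = (E, O, T, pi, tau)."""
    events: set[str]                  # E
    objects: set[str]                 # O
    object_types: set[str]            # T
    pi: dict[str, frozenset[str]]     # pi: event -> set of involved objects
    tau: dict[str, float]             # tau: event -> non-negative timestamp

    def validate(self) -> None:
        """Check that pi and tau are total on E and respect their codomains."""
        for e in self.events:
            if e not in self.pi or e not in self.tau:
                raise ValueError(f"event {e!r} lacks objects or a timestamp")
            if not self.pi[e] <= self.objects:
                raise ValueError(f"event {e!r} references unknown objects")
            if self.tau[e] < 0:
                raise ValueError(f"event {e!r} has a negative timestamp")

    def events_of(self, obj: str) -> list[str]:
        """Events involving one object, ordered by timestamp."""
        hits = [e for e in self.events if obj in self.pi[e]]
        return sorted(hits, key=lambda e: self.tau[e])


# Illustrative data: one event touches an order and two items at once.
log = ObjectCentricLog(
    events={"e1", "e2"},
    objects={"o1", "i1", "i2"},
    object_types={"order", "item"},
    pi={"e1": frozenset({"o1", "i1", "i2"}), "e2": frozenset({"i1"})},
    tau={"e1": 0.0, "e2": 3.5},
)
log.validate()
```

Here `log.events_of("i1")` yields both events while `log.events_of("o1")` yields only `e1`, which is the multi-object referencing that a single-case log cannot express directly.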

For process mining and information systems, this structure is concretized in the OCEL (Object-Centric Event Log) model, which augments events and objects with type and attribute functions, recording both static and dynamic (time-varying) information (Miri et al., 13 Mar 2025, Khayatbashi et al., 30 Nov 2024). In the computer vision domain, an object-centric input typically consists of multi-object scenes or sets of object-specific views, with associated object-type and localization information (Đukić et al., 19 Mar 2025, Agarwal et al., 28 Nov 2025).

This setting generalizes earlier models by supporting:

  • Multi-object event referencing: Each event may link to zero, one, or multiple objects possibly of distinct types.
  • Attribute evolution: Object attributes can be static or dynamic, with the latter tracked via explicit temporal tables (Goossens et al., 2022, Goossens et al., 2023).
  • Object-object and event-object relations: Events can be linked to objects by qualified relationships, enabling modeling of roles or causal attributions (Bosmans et al., 1 Oct 2024, Khayatbashi et al., 26 Aug 2025).
  • Heterogeneous schemas: Arbitrary and evolving sets of object types, event types, and attribute keys, with minimal restrictions on the underlying universes.

2. Data Modeling, Schema, and Representation

Object-centric data settings are typically structured in a multi-table or graph-based schema, supporting generality, scalability, and schema evolution. The OCEL 2.0 and DOCEL formats provide formal models capturing event and object universes, mapping functions, and relation tables (Miri et al., 13 Mar 2025, Khayatbashi et al., 30 Nov 2024, Goossens et al., 2022).

A canonical schema includes:

  • Event tables: Containing event ids, types, timestamps, and static attributes.
  • Object tables: Containing object ids, types, and static/dynamic attributes.
  • Event-to-object relation tables: Representing the general (many-to-many) association of events to objects.
  • Object-to-object relation tables: Encoding qualified relationships between objects.
  • Dynamic attribute tables: Capturing time-varying properties of objects, with (object id, event id, value) records for each change (Goossens et al., 2022, Goossens et al., 2023).

A minimal and generic relational schema avoids encoding process-specific logic at the column-level, instead storing all type and attribute information as table rows. This supports streaming data, novel types and attributes, and high-volume ingestion without requiring data definition language (DDL) changes (Bosmans et al., 1 Oct 2024).
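
The schema described above can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumptions chosen for illustration and are not the OCEL 2.0 or DOCEL column layout; the attribute name is stored as a row value so that a new attribute needs no DDL change.

```python
import sqlite3

# Hypothetical generic schema: types and attribute names live in rows, not columns.
SCHEMA = """
CREATE TABLE event (
    event_id   TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    ts         TEXT NOT NULL          -- ISO 8601 timestamp
);
CREATE TABLE object (
    object_id   TEXT PRIMARY KEY,
    object_type TEXT NOT NULL
);
CREATE TABLE event_object (           -- many-to-many, with a qualifier
    event_id  TEXT NOT NULL REFERENCES event(event_id),
    object_id TEXT NOT NULL REFERENCES object(object_id),
    qualifier TEXT NOT NULL DEFAULT ''
);
CREATE TABLE object_object (          -- qualified object-to-object relations
    source_id TEXT NOT NULL REFERENCES object(object_id),
    target_id TEXT NOT NULL REFERENCES object(object_id),
    qualifier TEXT NOT NULL DEFAULT ''
);
CREATE TABLE object_attribute_value ( -- one row per attribute change
    object_id TEXT NOT NULL REFERENCES object(object_id),
    event_id  TEXT NOT NULL REFERENCES event(event_id),
    attribute TEXT NOT NULL,
    value     TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(SCHEMA)

conn.executemany("INSERT INTO event VALUES (?, ?, ?)", [
    ("e1", "place order", "2025-01-01T09:00:00"),
    ("e2", "change price", "2025-01-02T10:00:00"),
])
conn.executemany("INSERT INTO object VALUES (?, ?)",
                 [("o1", "order"), ("i1", "item")])
conn.executemany("INSERT INTO event_object VALUES (?, ?, ?)", [
    ("e1", "o1", "places"), ("e1", "i1", "contains"), ("e2", "i1", "reprices"),
])
conn.executemany("INSERT INTO object_object VALUES (?, ?, ?)",
                 [("o1", "i1", "contains")])
# A new attribute ("price") is just new rows; no ALTER TABLE is needed.
conn.executemany("INSERT INTO object_attribute_value VALUES (?, ?, ?, ?)", [
    ("i1", "e1", "price", "10.0"), ("i1", "e2", "price", "12.5"),
])

# History of a dynamic attribute, ordered by the time of the changing event.
price_history = conn.execute(
    """SELECT e.ts, v.value
       FROM object_attribute_value AS v
       JOIN event AS e ON e.event_id = v.event_id
       WHERE v.object_id = ? AND v.attribute = ?
       ORDER BY e.ts""",
    ("i1", "price"),
).fetchall()
```

Storing values as text keeps the sketch short; a production design would likely add typed value columns or a separate type catalogue.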

3. Analytical Operations and Process Mining

The object-centric data setting enables a diverse set of analytical operations that leverage its rich relational and temporal structure:

  • Granularity adjustment: Analysts can dynamically drill-down, roll-up, unfold, and fold object or event types based on attribute values or event-object relations to achieve desired levels of abstraction. These transformations are formal, reversible, and information-preserving (Khayatbashi et al., 30 Nov 2024).
  • Process scope definition: Multiple, overlapping process boundaries can be explicitly embedded as objects, supporting scope-based aggregation, drill-down, and multi-level analysis. Scope enrichment is enabled via explicit formal grammars and process-object constructs (Khayatbashi et al., 26 Aug 2025).
  • Conformance checking: Object-centric Petri nets with identifiers (OPIDs) and their data-aware extensions (DOPIDs) enable model-based checking of logs with explicit multi-object synchronization, tracking both identity and data-value constraints (Gianola et al., 2023, Gianola et al., 21 May 2025).
  • Predictive analytics and performance metrics: Analysts can extract interaction-based, object-centric features for advanced prediction (e.g., using CatBoost), and define metrics such as synchronization time, pooling time, and lagging time inherent to multi-object orchestration (Galanti et al., 2022, Park et al., 2022).

Table: Key Operations Enabled by Object-Centric Data Setting

| Operation | Description | References |
|---|---|---|
| Drill-down/Roll-up | Alter object/event type granularity | (Khayatbashi et al., 30 Nov 2024) |
| Scope enrichment | Explicit multi-process semantics | (Khayatbashi et al., 26 Aug 2025) |
| Dynamic attributes | Time-varying object properties | (Goossens et al., 2022) |
| Object-centric Petri nets | Modelling identity and object synchronization | (Gianola et al., 2023) |
| Conformance alignment | Multi-object, attribute-rich trace alignment | (Gianola et al., 2023, Gianola et al., 21 May 2025) |

These operations facilitate multidimensional, multi-level, and multi-perspective process analysis, supporting applications ranging from process discovery, compliance checking, and decision mining to advanced predictive analytics (Goossens et al., 26 Jan 2024, Khayatbashi et al., 30 Nov 2024, Galanti et al., 2022).
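
To make the granularity adjustment concrete, the following simplified sketch refines an object type by an attribute value (drill-down) and merges the refined types back (roll-up). The function names, the `type[attr=value]` naming scheme, and the sample data are assumptions for illustration; this is not the formal operator definition of Khayatbashi et al.

```python
def drill_down(object_types, attributes, obj_type, attr):
    """Split `obj_type` into subtypes named 'type[attr=value]'.

    object_types: dict object id -> type name
    attributes:   dict object id -> dict of static attribute values
    """
    refined = {}
    for obj, t in object_types.items():
        if t == obj_type and attr in attributes.get(obj, {}):
            refined[obj] = f"{t}[{attr}={attributes[obj][attr]}]"
        else:
            refined[obj] = t  # other types are left untouched
    return refined


def roll_up(object_types, obj_type):
    """Merge every 'obj_type[...]' subtype back into `obj_type`."""
    prefix = obj_type + "["
    return {
        obj: (obj_type if t.startswith(prefix) else t)
        for obj, t in object_types.items()
    }


# Illustrative data: two orders with different priorities, and one item.
types = {"o1": "order", "o2": "order", "i1": "item"}
attrs = {"o1": {"priority": "high"}, "o2": {"priority": "low"}}

fine = drill_down(types, attrs, "order", "priority")
coarse = roll_up(fine, "order")
```

In this toy setting `coarse` equals the original `types` mapping, which mirrors the reversibility property mentioned above; the event-to-object relation is not modified by either step.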

4. Implementation, Data Engineering, and Scalability

A robust object-centric data setting demands scalable, flexible, and maintainable system design (Bosmans et al., 1 Oct 2024). Key requirements include:

  • Physical schema generality: All processes, event types, object types, and attributes must be encoded as data rather than schema (table rows rather than columns), supporting straightforward schema evolution.
  • Hub-and-spoke architecture: A central data hub abstracts away individual source system peculiarities, supports append-only, asynchronous ingestion, and exports to multiple formats (e.g., OCEL, DOCEL, Neo4j) (Bosmans et al., 1 Oct 2024).
  • Automated data quality assessment: Enforced at ingestion to ensure key constraints, typing, and referential integrity.
  • Streaming and incremental updates: Native support for new object and event types, out-of-order data, and large-scale partitioning.
  • Open-source tools and libraries: End-to-end stacks (e.g., Stack’t, processmining) provide ETL pipelines, transformation APIs, validation, and interactive visual analytics (Bosmans et al., 1 Oct 2024, Khayatbashi et al., 30 Nov 2024).

This infrastructure underpins both research-grade and production-scale process mining, business intelligence, and ML/AI applications.
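
The ingestion-time quality checks listed above (key constraints, typing, referential integrity) can be sketched as a plain function over one incoming batch. The row layout, the function name `check_batch`, and the issue labels are hypothetical and do not reflect the API of any cited tool.

```python
def check_batch(events, objects, relations):
    """Return a list of (issue, id) pairs found in one ingestion batch.

    events, objects: lists of dicts with "id" and "type" keys
    relations:       list of (event_id, object_id) pairs
    """
    issues = []
    event_ids, object_ids = set(), set()

    for row in events:
        if row["id"] in event_ids:              # key constraint
            issues.append(("duplicate_event", row["id"]))
        event_ids.add(row["id"])
        if not row.get("type"):                 # typing
            issues.append(("missing_event_type", row["id"]))

    for row in objects:
        if row["id"] in object_ids:             # key constraint
            issues.append(("duplicate_object", row["id"]))
        object_ids.add(row["id"])
        if not row.get("type"):                 # typing
            issues.append(("missing_object_type", row["id"]))

    for event_id, object_id in relations:       # referential integrity
        if event_id not in event_ids:
            issues.append(("unknown_event", event_id))
        if object_id not in object_ids:
            issues.append(("unknown_object", object_id))

    return issues


# Illustrative batch with one duplicate key, one untyped object, one dangling link.
issues = check_batch(
    events=[{"id": "e1", "type": "place order"},
            {"id": "e1", "type": "place order"}],
    objects=[{"id": "o1", "type": "order"}, {"id": "i1", "type": ""}],
    relations=[("e1", "o1"), ("e1", "i9")],
)
```

With out-of-order streaming data, a dangling reference may simply mean the referenced row has not arrived yet, so a real pipeline might park such rows for later resolution instead of rejecting them.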

5. Methodological and Theoretical Implications

The object-centric data setting introduces a fundamental shift in methodological assumptions across disciplines:

  • Declarative modeling: Object-centric models (e.g., OCBC) enable declarative specification of behavioral constraints over activities, supporting many-to-many and one-to-many associations, cardinality constraints, and process rules tied to object classes and relationships (Aalst et al., 2017).
  • Causal identifiability: In causal representation learning, object-centric architectures restore injectivity and enable efficient object-level disentanglement with weaker supervision than vectorized models, requiring only order-of-d latent perturbations rather than kd (Mansouri et al., 2023).
  • Anomaly detection: Feature extraction, propagation, and anomaly detection leverage bipartite incidence matrices and object-interaction graphs derived from the object-centric event relation (Berti et al., 12 Jul 2024).
  • Formal verification and certification: Alloy-based frameworks (e.g., FOCED) enable rigorous verification of data meta-models, structural constraints, and temporal rules before analytics, preventing data integrity loss on import or transformation (Latif et al., 10 Nov 2025).
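
The two structures named in the anomaly-detection item, a bipartite incidence matrix and an object-interaction graph, can both be derived from the event-to-object relation. The sketch below assumes the relation is given as a dict from event ids to sets of object ids; the function names and data are illustrative, and this is not the specific feature construction of Berti et al.

```python
from itertools import combinations


def incidence_matrix(pi, events, objects):
    """Rows are events, columns are objects; 1 marks 'event involves object'."""
    return [[1 if o in pi[e] else 0 for o in objects] for e in events]


def interaction_graph(pi):
    """Weighted undirected edges: how often two objects share an event."""
    edges = {}
    for objs in pi.values():
        for a, b in combinations(sorted(objs), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges


# Illustrative relation: an order o1 and two items i1, i2.
pi = {
    "e1": {"o1", "i1", "i2"},
    "e2": {"i1"},
    "e3": {"o1", "i1"},
}
events = ["e1", "e2", "e3"]
objects = ["i1", "i2", "o1"]

matrix = incidence_matrix(pi, events, objects)
graph = interaction_graph(pi)
```

Row sums of `matrix` give the number of objects per event and column sums give the number of events per object; either can serve as a simple feature, and edge weights in `graph` indicate how tightly two objects are coupled.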

Misconceptions may arise if case concepts are artificially imposed onto an object-centric setting, leading to loss of process interdependence and incorrect performance or conformance metrics (Park et al., 2022). The object-centric model is provably expressive (strictly more general than Declare or XES trace-based models), efficiently checkable, and fosters transparency in multi-object event tracing (Aalst et al., 2017, Miri et al., 13 Mar 2025).
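
The effect of imposing a case notion can be seen by flattening a tiny log onto one object type. The function name `flatten` and the order/item data are illustrative assumptions; events are assumed to be listed in timestamp order.

```python
def flatten(events, pi, object_type_of, case_type):
    """Project an object-centric log onto a single case notion.

    Each object of `case_type` becomes a case whose trace lists the
    events that involve it.
    """
    cases = {}
    for e in events:  # assumed already sorted by timestamp
        for o in sorted(pi[e]):
            if object_type_of[o] == case_type:
                cases.setdefault(o, []).append(e)
    return cases


# e1 = "place order" (order o1 with items i1, i2); e2 = "pay order" (o1 only).
events = ["e1", "e2"]
pi = {"e1": {"o1", "i1", "i2"}, "e2": {"o1"}}
object_type_of = {"o1": "order", "i1": "item", "i2": "item"}

by_order = flatten(events, pi, object_type_of, "order")
by_item = flatten(events, pi, object_type_of, "item")
```

Flattening on `item` replicates `e1` into two cases and drops `e2` entirely, so event counts and any timing metric computed on the flattened log no longer match the original process. Flattening on `order` keeps both events but hides the items.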

6. Applications and Impact

The object-centric data setting is broadly applicable to process mining, causal representation learning, computer vision, and IoT.

The setting is central to scalable, expressive, and rigorous analytics in evolving, multi-entity domains, aligning data structures with the true complexity of enterprise, scientific, and perceptual systems.


References: (Miri et al., 13 Mar 2025, Khayatbashi et al., 30 Nov 2024, Khayatbashi et al., 26 Aug 2025, Goossens et al., 2022, Goossens et al., 2023, Aalst et al., 2017, Bosmans et al., 1 Oct 2024, Đukić et al., 19 Mar 2025, Agarwal et al., 28 Nov 2025, Gianola et al., 2023, Gianola et al., 21 May 2025, Goossens et al., 26 Jan 2024, Mansouri et al., 2023, Galanti et al., 2022, Park et al., 2022, Berti et al., 12 Jul 2024, Latif et al., 10 Nov 2025, Jiang et al., 2023)
