Process Model Forecasting (PMF)

Updated 15 December 2025

PMF is a forecasting methodology that models how process structures evolve by analyzing the directly-follows relations recorded in event logs.
It employs classical time series methods and advanced Time Series Foundation Models to map historical behavior to future process graph representations.
Extensions incorporate resource-aware and probabilistic approaches, enhancing accuracy and revealing emerging process trends and bottlenecks.

Process Model Forecasting (PMF) aims to predict the future structural evolution of process models over time, focusing on the temporal dynamics of control-flow relations (particularly directly-follows, DF, relations) derived from event logs. PMF complements predictive process monitoring, which targets single-case or instance-level forecasts, by providing system-level forecasts that characterize how the overall process model may evolve, drift, or exhibit emerging behavioral changes (Yu et al., 8 Dec 2025, Smedt et al., 2021). PMF problems are typically formulated as multivariate time series forecasting tasks, where the objective is to map historical DF (or related) series to their future values or to future process model representations.

1. Formalization and Definitions

Let $\mathcal{L} = \{\sigma_1, \ldots, \sigma_N\}$ denote an event log, a multiset of cases in which each $\sigma_i$ is an ordered sequence of timestamped events whose activity labels come from a finite set $\mathcal{A}$. For each $(a_p, a_q) \in \mathcal{A} \times \mathcal{A}$, $>_{\mathcal{L}}(a_p, a_q)$ counts the number of times $a_q$ directly follows $a_p$ in $\mathcal{L}$. Process evolution is tracked via a sequence of directly-follows graphs (DFGs), $DFG_t = (V, E_t)$ with $V = \mathcal{A}$ and $E_t = \{(a_p, a_q, w_{pq}^{(t)}) \mid w_{pq}^{(t)} = >_{\mathcal{L}_t}(a_p, a_q)\}$, where $\mathcal{L}_t$ denotes the log partition for time window $t$.
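The construction of $DFG_t$ can be sketched in a few lines. The data layout (a mapping from case id to a time-ordered list of activity and timestamp pairs), the `window_of` helper, and the rule that a directly-follows pair is assigned to the window of its second event are all assumptions made for illustration; the definitions above do not fix them.

```python
from collections import Counter, defaultdict
from datetime import datetime


def dfg_per_window(log, window_of):
    """Count directly-follows pairs per time window.

    log: dict case_id -> list of (activity, timestamp), sorted by timestamp.
    window_of: function mapping a timestamp to a window key (hypothetical helper).
    Returns dict window_key -> Counter {(a_p, a_q): weight}.
    """
    dfgs = defaultdict(Counter)
    for events in log.values():
        for (a_p, _), (a_q, ts_q) in zip(events, events[1:]):
            # Assumption: the pair belongs to the window of its second event.
            dfgs[window_of(ts_q)][(a_p, a_q)] += 1
    return dict(dfgs)


def df_series(dfgs, pair, windows):
    """Univariate series for one DF pair, with 0 where the pair is absent."""
    return [dfgs.get(t, Counter())[pair] for t in windows]


log = {
    "c1": [("a", datetime(2024, 1, 1)), ("b", datetime(2024, 1, 1)),
           ("c", datetime(2024, 1, 2))],
    "c2": [("a", datetime(2024, 1, 2)), ("b", datetime(2024, 1, 2))],
}
dfgs = dfg_per_window(log, window_of=lambda ts: ts.date())  # daily windows
days = sorted(dfgs)
series_ab = df_series(dfgs, ("a", "b"), days)
```

Other windowing choices (equal-width time bins, equal numbers of events per bin) fit the same interface by swapping `window_of`.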

Each DF relation $(a_i, a_j)$ generates a univariate time series $x_t^{(i,j)} = w_{ij}^{(t)}$. The objective of PMF is $h$-step-ahead forecasting: learn $f_\theta$ such that $\{x_1^{(i,j)},\ldots,x_T^{(i,j)}\} \rightarrow \{x_{T+1}^{(i,j)},\ldots,x_{T+h}^{(i,j)}\}$ for every pair, or, equivalently, $f_\theta: \{DFG_1,\ldots,DFG_T\} \rightarrow \{DFG_{T+1},\ldots,DFG_{T+h}\}$ (Yu et al., 8 Dec 2025, Smedt et al., 2021). Depending on the research context, PMF may target lower-dimensional process-level KPIs (such as throughput time) or higher-dimensional behavior graphs, and it can be extended with additional structural or resource-centric features (Leribaux et al., 13 Oct 2025).
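For learned models, the $h$-step-ahead mapping is usually trained on (history, future) pairs cut from each series. The sketch below shows one common way to do this with a sliding window; the function name and the `context` and `horizon` parameters are illustrative choices, not taken from the cited works.

```python
def make_windows(series, context, horizon):
    """Slide over one DF series and emit (history, future) training pairs."""
    pairs = []
    for start in range(len(series) - context - horizon + 1):
        history = series[start:start + context]
        future = series[start + context:start + context + horizon]
        pairs.append((history, future))
    return pairs


# One DF series observed over six windows, context T=3, horizon h=2.
pairs = make_windows([3, 4, 6, 5, 7, 8], context=3, horizon=2)
```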

2. Classical and Statistical PMF Techniques

The foundational approach encodes the process event log as a multivariate set of DF time series, one for each pair $(a,b)$, and applies established time series forecasting methods independently to each series (Smedt et al., 2021). Canonical methods include:

Naïve/Seasonal forecasting: projecting the most recent observed value, $\hat x_{T+h} = x_T$, or repeating the value observed one season earlier, $\hat x_{T+h} = x_{T+h-m}$ for season length $m$ and $h \le m$.
Exponential Smoothing: Simple Exponential Smoothing (level only, smoothing parameter $\alpha$) and Holt’s method (level and trend, parameters $(\alpha, \beta)$) model the DF frequencies (Smedt et al., 2021).
AR(p) and ARIMA(p,d,q) models: Model the autocorrelated structure and nonstationarity in DF series; order selection uses Box–Jenkins methodology, and forecasting employs maximum likelihood–fitted parameters.
GARCH augmentation: Addresses non-constant variance in the residuals by modeling time-varying volatility (Smedt et al., 2021).
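The simpler methods in this list can be written out directly. The following is a minimal sketch with fixed smoothing parameters; in practice $\alpha$ and $\beta$ would be estimated from the data, and the ARIMA and GARCH variants, which need a statistics library, are left out. The initialization of Holt's method (level at the first value, trend at the first difference) is one conventional choice and is an assumption here.

```python
def naive(series, h):
    """Repeat the last observed value for all h steps."""
    return [series[-1]] * h


def seasonal_naive(series, h, m):
    """Repeat the last full season of length m (requires len(series) >= m)."""
    return [series[len(series) - m + (k % m)] for k in range(h)]


def ses(series, h, alpha):
    """Simple exponential smoothing: flat forecast at the final level."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * h


def holt(series, h, alpha, beta):
    """Holt's linear method: level plus trend, extrapolated h steps."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (k + 1) * trend for k in range(h)]


# Each DF series is forecast independently, as described above.
df_series = {("a", "b"): [1, 2, 3, 4], ("b", "c"): [1, 2, 3, 4, 5, 6]}
forecast_ab = holt(df_series[("a", "b")], h=2, alpha=0.5, beta=0.5)
forecast_bc = seasonal_naive(df_series[("b", "c")], h=4, m=3)
```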

Edge-weight forecasts are rounded and thresholded, resulting in a predicted process graph $\widehat{DFG}_{T+h}$ that can be directly compared to ground-truth DFGs from future event log segments.
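A sketch of this post-processing step, assuming forecasts arrive as a mapping from activity pair to a real-valued prediction. The `min_weight` cutoff is a hypothetical parameter; the source does not state the threshold used.

```python
def to_dfg(edge_forecasts, min_weight=1):
    """Round each forecast edge weight and drop edges below min_weight."""
    dfg = {}
    for pair, value in edge_forecasts.items():
        weight = max(0, round(value))  # counts cannot be negative
        if weight >= min_weight:
            dfg[pair] = weight
    return dfg


predicted = to_dfg({("a", "b"): 12.4, ("b", "c"): 0.3, ("a", "c"): -1.2})
```

Note that Python's `round` rounds exact halves to the nearest even integer, which matters only for forecasts that land exactly on `.5`.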

3. Time Series Foundation Models for PMF

Recent advances center on leveraging large, pre-trained Time Series Foundation Models (TSFMs) for PMF (Yu et al., 8 Dec 2025). TSFMs—such as Chronos (T5-style encoder–decoder or encoder-only), MOIRAI (masked encoder, decoder-only, and mixture-of-experts variants), and TimesFM (decoder-only Transformer)—are trained on vast heterogeneous time series corpora and can transfer this temporal knowledge to process mining domains. Key features:

Architectures: Span encoder–decoder, encoder-only, and decoder-only Transformers; use quantized scalar representations, patching, and autoregressive or direct quantile objectives.
Forecasting modes:
- Zero-shot: Directly applies the pre-trained TSFM on DF time series without any PMF-specific fine-tuning. For quantile models, the median over $K$ sampled trajectories is used as the point forecast.
- Parameter-Efficient Fine-Tuning (LoRA): Introduces low-rank updates to attention layers with only LoRA matrices trainable, base weights frozen; enables lightweight adaptation.
- Full Fine-Tuning: All parameters updated using PMF-specific data.
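Two of the mechanics above lend themselves to a small numerical illustration: reducing $K$ sampled trajectories to a median point forecast, and the low-rank update that LoRA adds to a frozen weight matrix. This is plain-Python arithmetic under toy dimensions, not how any of the named TSFMs is implemented; the function names and the `scale` parameter (which plays the role of the usual LoRA scaling factor) are assumptions.

```python
from statistics import median


def median_point_forecast(samples):
    """samples: K trajectories, each a list of h values. Returns h medians."""
    return [median(step) for step in zip(*samples)]


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, scale=1.0):
    """y = W x + scale * B (A x); W stays frozen, only A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


samples = [[10, 12], [14, 11], [12, 19]]      # K=3 trajectories, h=2
point = median_point_forecast(samples)

W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]        # frozen, d_out=2, d_in=3
A = [[0.1, 0.2, 0.3]]                         # rank r=1, shape r x d_in
B = [[0.0], [0.0]]                            # zero init: adapter starts as a no-op
y0 = lora_forward(W, A, B, [1.0, 1.0, 1.0])
trainable = sum(len(r) for r in A) + sum(len(r) for r in B)  # r * (d_in + d_out)
```

With $B$ initialized to zero the adapted layer reproduces the pre-trained output exactly, so fine-tuning starts from zero-shot behavior. The saving in trainable parameters is negligible at this toy size but grows with the layer dimensions.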

Empirical results show that zero-shot TSFMs achieve 15–28% lower MAE and 20–33% lower RMSE compared to traditional and machine-learning models on standardized PMF benchmarks. Fine-tuning yields marginal gains ($<5\%$) on large, regular logs, but may lead to overfitting or even degrade performance on small or irregular logs. TSFMs exhibit strong generalization across process domains and provide data-efficient PMF solutions (Yu et al., 8 Dec 2025).

4. Process-Aware and Resource-Informative PMF Extensions

PMF extends beyond control-flow structural predictions to resource-aware and actor-enriched forecasting:

Actor-Enriched Time Series: Time-varying signals capturing actor involvement (continuation, interruption, handover behaviors) can be aligned as additional explanatory series. Daily counts $F_b(d)$ and durations $T_b(d)$ for each behavior $b$ serve as input features (Leribaux et al., 13 Oct 2025).
Model Classes: Gradient-boosted trees (XGBoost, LightGBM) effectively exploit structured actor features, outperforming univariate ARIMA and RNN variants (LSTM/GRU) in RMSE and $R^2$ across real logs. The inclusion of actor features reliably improves PMF performance, with tree-based models yielding stable gains and hybrid deep learning models (Conv1D + RNN + attention) delivering marginally smaller but consistent improvements (Leribaux et al., 13 Oct 2025).
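The actor features $F_b(d)$ and $T_b(d)$ can be assembled as follows. The record layout (day, behavior label, duration in seconds) is a hypothetical input format, and reading $T_b(d)$ as the total duration per day is an assumption; a mean duration would be an equally plausible reading of the source.

```python
from collections import defaultdict
from datetime import date


def daily_behavior_features(records):
    """records: iterable of (day, behavior, duration_seconds).

    Returns (F, T) with F[b][d] = count and T[b][d] = total duration.
    """
    F = defaultdict(lambda: defaultdict(int))
    T = defaultdict(lambda: defaultdict(float))
    for day, behavior, duration in records:
        F[behavior][day] += 1
        T[behavior][day] += duration
    return F, T


def feature_row(F, T, behaviors, day):
    """Flatten the features for one day into [F_b1, T_b1, F_b2, T_b2, ...]."""
    row = []
    for b in behaviors:
        row.extend([F[b][day], T[b][day]])
    return row


d1, d2 = date(2024, 1, 1), date(2024, 1, 2)
records = [
    (d1, "handover", 120.0),
    (d1, "handover", 60.0),
    (d1, "continuation", 30.0),
    (d2, "interruption", 300.0),
]
F, T = daily_behavior_features(records)
row_d1 = feature_row(F, T, ["continuation", "interruption", "handover"], d1)
```

Rows like `row_d1` would be concatenated with lagged DF values to form the input to a tree-based or neural forecaster.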

This suggests that resource signals are second-order drivers of process evolution and should be systematically incorporated into PMF pipelines.

5. Advanced Probabilistic and Process-Informed Forecasting Frameworks

Sophisticated PMF variants include probabilistic and process-informed models:

Pairwise Gaussian Markov Models (PMMs): Extend HMMs to admit arbitrary cross-covariances between observed and hidden process variables. PMMs, implemented via a 2-D Kalman filter, yield substantial reductions in filtering MSE (up to 5–10$\times$) compared to HMMs, especially for non-Kalmanian process covariance structure (Escudier et al., 12 Feb 2024).
Input–Output Hidden Markov Models (IO-HMMs): Allow process mode transitions and output distributions to directly depend on observed covariates (e.g., shift type, environmental factors). Recursive Bayesian updates (Dirichlet-multinomial for discrete transitions, recursive least-squares for Gaussian regression parameters) allow for adaptive multivariate PMF with uncertainty quantification and low computational cost. The IO-HMM approach demonstrates 5–15% RMSE improvements over univariate or persistence baselines (Miguelez et al., 12 Mar 2024).
Process-Informed Forecasting (PIF): Incorporating a recipe-informed prior (piecewise-linear trajectory) as a soft process model into both classical (ARIMA, ETS) and deep neural architectures (KAN, LSTM, Transformer) via custom loss terms (fixed-weight, uncertainty-based, RBA) yields physically plausible, robust, and generalizable forecasts. PIF improves gradient alignment and noise resilience, and enables rapid transfer learning across similar thermal processes (Rubini et al., 24 Sep 2025).
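Of the updates named above, the Dirichlet-multinomial step for discrete transitions is the simplest to illustrate. The sketch below makes two strong simplifying assumptions that the cited IO-HMM work does not: the mode at each step is treated as known rather than hidden, and the covariate is a single discrete value such as shift type. With hidden modes, the unit increment would be replaced by a posterior responsibility weight. The class and method names are hypothetical.

```python
class DirichletTransitions:
    """Per-covariate Dirichlet posterior over mode-to-mode transition rows."""

    def __init__(self, n_modes, covariate_values, prior=1.0):
        # alpha[u][i][j]: pseudo-count for transition i -> j under covariate u
        self.alpha = {
            u: [[prior] * n_modes for _ in range(n_modes)]
            for u in covariate_values
        }

    def update(self, u, i, j):
        """Recursive update: one observed transition adds one to its count."""
        self.alpha[u][i][j] += 1.0

    def transition_probs(self, u, i):
        """Posterior mean of transition row i given covariate value u."""
        row = self.alpha[u][i]
        total = sum(row)
        return [a / total for a in row]


model = DirichletTransitions(n_modes=2, covariate_values=["day", "night"])
model.update("night", 0, 1)
model.update("night", 0, 1)
night_row = model.transition_probs("night", 0)
day_row = model.transition_probs("day", 0)
```

Each update is constant-time, which is consistent with the low computational cost noted above, and the spread of the Dirichlet posterior gives a direct handle on uncertainty.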

6. Evaluation Metrics and Empirical Findings

Standard metrics for PMF evaluation include:

Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE): Assessed per DF time series and averaged.
Entropic Relevance (ER): Measures the explanatory power of a forecasted DFG for a set of traces, defined as the average code length under a Markov model induced by the graph. Lower ER indicates higher fidelity.
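MAPE on ER: The mean absolute percentage error between the ER of the forecasted DFG and the ER of the true future DFG, which quantifies how far the forecast's explanatory power deviates from that of the actual model.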
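These metrics can be sketched as below. MAE, RMSE, and MAPE follow their standard definitions. `markov_code_length` is only a simplified stand-in for ER: it gives the code length of a single trace under the first-order Markov model induced by a DFG and returns `None` for traces the graph cannot replay, whereas the published ER definition also assigns a fallback coding cost to such traces. The `<start>` and `<end>` markers are an assumed encoding of trace boundaries.

```python
import math
from collections import defaultdict


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mean_over_series(metric, actual_by_pair, forecast_by_pair):
    """Compute a metric per DF series, then average across series."""
    scores = [metric(actual_by_pair[p], forecast_by_pair[p]) for p in actual_by_pair]
    return sum(scores) / len(scores)


def mape(actual_values, forecast_values):
    """Mean absolute percentage error; actual values must be non-zero."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual_values, forecast_values)]
    return 100.0 * sum(terms) / len(terms)


def markov_code_length(trace, dfg):
    """Bits needed to encode a trace under the Markov model induced by dfg."""
    out_totals = defaultdict(float)
    for (a, _), w in dfg.items():
        out_totals[a] += w
    path = ["<start>"] + list(trace) + ["<end>"]
    bits = 0.0
    for a, b in zip(path, path[1:]):
        w = dfg.get((a, b), 0)
        if w <= 0:
            return None  # trace cannot be replayed on this graph
        bits += -math.log2(w / out_totals[a])
    return bits


actual = {("a", "b"): [1, 3], ("b", "c"): [0, 0]}
forecast = {("a", "b"): [2, 5], ("b", "c"): [3, 3]}
avg_mae = mean_over_series(mae, actual, forecast)

dfg = {("<start>", "a"): 2, ("a", "b"): 1, ("a", "<end>"): 1, ("b", "<end>"): 1}
bits_ab = markov_code_length(["a", "b"], dfg)
```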

Empirical results (Yu et al., 8 Dec 2025, Smedt et al., 2021) consistently show that:

For moderate-to-large logs with discernible trends and seasonality (e.g., BPI2017, BPI2019_1, Hospital Billing), zero-shot TSFMs and classical ARIMA/SES methods robustly forecast process evolution, with the former outperforming the best specialized baselines.
On logs with irregular, sparse, or highly complex behavior (e.g., Sepsis), fine-tuning or hybrid approaches provide inconsistent gains, and zero-shot TSFMs remain a preferred default.
Pruning the forecasted DFG to retain only the most frequent transitions (top $>50\%$) drastically reduces error rates and reveals structural process changes (e.g., bottlenecks or drift) (Smedt et al., 2021).
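Frequency-based pruning can be sketched as follows. The source does not spell out how the cutoff is applied, so this version assumes it means keeping a fixed fraction of edges ranked by weight; `keep_fraction` is a hypothetical parameter, and a cutoff on cumulative frequency share would be an equally plausible reading.

```python
import math


def prune_dfg(dfg, keep_fraction=0.5):
    """Keep the top keep_fraction of edges, ranked by weight (at least one)."""
    ranked = sorted(dfg.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    return dict(ranked[:n_keep])


forecast_dfg = {("a", "b"): 40, ("b", "c"): 35, ("a", "c"): 3, ("c", "b"): 2}
pruned = prune_dfg(forecast_dfg, keep_fraction=0.5)
```

Dropping the low-frequency tail removes the sparse, noisy series that are hardest to forecast, which is one intuition for why errors fall after pruning.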

7. Challenges, Limitations, and Future Directions

Current PMF research faces several challenges:

Sparsity and Heterogeneity: DF time series are often sparse and highly heterogeneous; purely data-driven methods may struggle without cross-domain temporal knowledge (Yu et al., 8 Dec 2025).
Evaluation Scope: Most studies benchmark on a limited set of public event logs; further validation on more diverse and complex processes is required.
Scalability and Overfitting: Fine-tuning high-parameter models may overfit on low-signal or small datasets; robust PEFT strategies and cross-validation are necessary.
Structural Awareness: Purely edge-wise or univariate modeling disregards inherent structural constraints of processes. Future directions include explicitly encoding graph constraints (e.g., via graph-aware losses), integrating case-level and system-level forecasting, and developing interactive frameworks for incorporating PMF outputs into process mining analytics (Yu et al., 8 Dec 2025).

A plausible implication is that systematic PMF adoption in industry will benefit from combining foundation models for generalization, resource/actor-enriched inputs for interpretability, and process-informed regularization for physical plausibility and regulatory compliance. The integration of domain priors, advanced sequence modeling, and process mining visualization tools is an open area for research and methodological innovation.