Threshold-based Predictive Maintenance

Updated 27 April 2026

Threshold-based predictive maintenance (TBPM) is a strategy that initiates maintenance when monitored health metrics exceed predefined or adaptive thresholds.
It integrates statistical process control, machine learning-based anomaly detection, and cost-optimization techniques to balance reliability and economic performance.
Empirical studies report true positive rates of about 90% and early warnings ranging from minutes to weeks before failure in applications such as rotating machinery and wind turbines.

Threshold-based predictive maintenance (TBPM) refers to the class of methodologies in which maintenance actions are triggered when a monitored health indicator—directly observable, proxy-derived, or inferred—crosses a predefined or adaptively determined threshold. TBPM strategies provide actionable decision rules across a range of industrial applications including equipment with well-characterized degradation, high-dimensional sensor arrays, and assets lacking direct condition monitoring, often enabling the integration of maintenance logistics and cost optimization under uncertainty.

1. Fundamental Concepts and Mathematical Formulation

TBPM operates by monitoring a health metric (or set of metrics) and invoking maintenance—preventive, corrective, or inspection—upon exceedance of an established threshold. The core components are:

Degradation Process Modeling: For classical TBPM, system degradation is modeled as a stochastic process, such as a gamma process with state $X(t)$ and failure threshold $L$. Preventive maintenance is initiated when $X(t)$ reaches a preventive threshold $M < L$ (Soltani, 2018).
Health Indicator (HI)-based Formulation: In feature-driven or machine learning settings, an HI is constructed—either as a univariate feature, an aggregated score, or the output of an unsupervised/supervised model—and alarm logic is formulated as $\text{Alarm at } t \text{ if } HI(t) > \tau$ (Hamaide et al., 2022).
Threshold Definition: Thresholds can be static, based on empirical statistics (e.g., $\tau = \mu + k \sigma$), adaptive via regression-residual control charts, or determined through profit-based cost optimization (Kenbeek et al., 2016, Begun et al., 2024).
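The two-threshold rule for a gamma degradation process can be sketched as follows. This is a minimal illustration, not the model of Soltani (2018): the function names, the shape and scale parameters, and the numeric values of $M$ and $L$ are all assumptions chosen for the example.

```python
import random

def simulate_gamma_path(shape_per_step, scale, n_steps, seed=0):
    """Return cumulative degradation X(t) for t = 0..n_steps.

    Increments are independent gamma draws, so the path never decreases.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gammavariate(shape_per_step, scale))
    return path

def decide_at_inspection(x, preventive_threshold, failure_threshold):
    """Map an inspected degradation level to a maintenance action."""
    if x >= failure_threshold:
        return "corrective"   # X(t) has reached the failure threshold L
    if x >= preventive_threshold:
        return "preventive"   # M <= X(t) < L
    return "none"

M, L = 8.0, 10.0  # hypothetical preventive and failure thresholds, M < L
path = simulate_gamma_path(shape_per_step=0.5, scale=1.0, n_steps=60, seed=42)
actions = [decide_at_inspection(x, M, L) for x in path]
```

Because the path is non-decreasing, the action sequence can only move from "none" to "preventive" to "corrective", which is why the gap between $M$ and $L$ determines how much time is available for a preventive intervention.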

These control rules can be subject to further constraints, such as inventory availability or probabilistic risk bounds, yielding objective formulations that explicitly minimize long-run average costs or business metrics under availability and operational constraints.

2. Sensor Data, Feature Processing, and Anomaly Scoring

The choice of HI and its preprocessing pipeline is application dependent:

Direct Sensor Feature Monitoring: Scalar or low-dimensional physical parameters (vibration, temperature) are modeled directly. Temporal smoothing (moving averages, exponentially weighted moving averages) enhances signal stability (Hamaide et al., 2022).
Machine Learning-based Scoring: Unsupervised (autoencoders, one-class SVM) or supervised (binary SVM, SVR) mappings aggregate high-dimensional sensor arrays into an HI. For autoencoder approaches, the squared $\ell_2$ reconstruction error serves as an anomaly score $E(x)$ (Givnan et al., 2021).
Regression/Residual Modeling: For complex systems (e.g., wind turbines), adaptive regression models on environmental and operational covariates yield residuals $r_t = Y_t - \widehat{Y}_t$, which are monitored against learned distributions (Kenbeek et al., 2016, Begun et al., 2024).
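The three kinds of health indicator described above reduce to short computations once the upstream model output is available. The sketch below assumes that a reconstruction `x_hat` or a prediction `predicted` has already been produced by some model; the helper names are hypothetical and are not taken from the cited papers.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average of a sensor feature.

    alpha in (0, 1]; larger alpha tracks the raw signal more closely.
    The first output equals the first input.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for v in values:
        prev = smoothed[-1] if smoothed else v
        smoothed.append(alpha * v + (1.0 - alpha) * prev)
    return smoothed

def reconstruction_error(x, x_hat):
    """Squared l2 distance E(x) between an input and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def residuals(observed, predicted):
    """Residuals r_t = Y_t - Y_hat_t from a regression model's predictions."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]
```

Any of the three outputs can then serve as the $HI(t)$ that is compared against a threshold.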

TBPM accommodates both direct detection of physical wear and indirect anomaly signaling through data-driven surrogates, often requiring explicit recalibration after corrective actions to account for post-repair operating shifts.

3. Threshold Selection, Calibration, and Adaptivity

Threshold determination in practice employs several methodologies:

Empirical Statistical Calibration: In unsupervised learning or anomaly detection, thresholds are set at high quantiles of the error distribution for healthy data, e.g., $\tau = \mu_E + k \sigma_E$, where $k$ is selected for the desired false positive/negative balance (Givnan et al., 2021).
Statistical Process Control Frameworks: Regression-adjusted residuals are assigned static or dynamic $\pm k\sigma$ control limits, possibly within a sliding window, enabling sensitivity trade-offs (Kenbeek et al., 2016).
Profit- or Cost-based Tuning: In settings with rare failures, thresholds are sampled in proportion to their profitability or risk-penalty, using explicit cost functions integrating false positive and false negative costs, and anticipated savings from early interventions (Begun et al., 2024).
Cross-Validation/Optimization: Double cross-validation selects threshold and any temporal smoothing window to maximize composite metrics that balance detection accuracy, lead-time, and economic objectives (Hamaide et al., 2022).
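Two of the calibration routes above can be sketched in a few lines. The first function applies $\tau = \mu_E + k \sigma_E$ to healthy-data scores. The second is a deliberately simplified stand-in for cost-based tuning: it performs a plain grid search over candidate thresholds with assumed per-event costs, and it is not the profit-proportional sampling scheme of Begun et al. (2024). All names and cost values are hypothetical.

```python
import statistics

def calibrate_threshold(healthy_scores, k=3.0):
    """Static threshold tau = mu_E + k * sigma_E from healthy-data scores."""
    mu = statistics.fmean(healthy_scores)
    sigma = statistics.stdev(healthy_scores)  # sample standard deviation
    return mu + k * sigma

def total_cost(threshold, healthy_scores, faulty_scores, cost_fp, cost_fn):
    """Cost of false alarms plus missed detections at a given threshold."""
    false_alarms = sum(s > threshold for s in healthy_scores)
    missed = sum(s <= threshold for s in faulty_scores)
    return cost_fp * false_alarms + cost_fn * missed

def pick_threshold(candidates, healthy_scores, faulty_scores, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(t, healthy_scores, faulty_scores, cost_fp, cost_fn),
    )

# Illustrative data: a missed failure is assumed to cost 10x a false alarm.
healthy = [0.1, 0.2, 0.3, 0.9]
faulty = [0.8, 1.2]
best = pick_threshold([0.25, 0.5, 1.0], healthy, faulty, cost_fp=1.0, cost_fn=10.0)
```

With these numbers the middle candidate wins: the lowest threshold raises two false alarms, and the highest misses one failure, which is far more expensive under the assumed cost ratio.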

Dynamic recalibration—after major repairs, concept drift, regime changes—is often implemented, resetting local means for residual/CUSUM detectors to maintain robust operating baselines after non-stationary events (Begun et al., 2024).
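A one-sided CUSUM detector with baseline resets gives a concrete picture of this recalibration step. The sketch below is a textbook upper CUSUM, not the specific detector of Begun et al. (2024). The `resets` argument, which maps a time index to a new reference mean, is a hypothetical interface introduced for illustration.

```python
def cusum_alarms(residuals, reference_mean, slack, decision_limit, resets=None):
    """Return the indices at which an upper CUSUM statistic raises an alarm.

    resets: optional dict {time_index: new_reference_mean}. At those indices
    (e.g., just after a repair) the baseline is replaced and the accumulated
    statistic is cleared before the sample is processed.
    """
    resets = resets or {}
    s = 0.0
    alarms = []
    for t, r in enumerate(residuals):
        if t in resets:
            reference_mean = resets[t]
            s = 0.0
        # Accumulate only deviations larger than the slack allowance.
        s = max(0.0, s + (r - reference_mean - slack))
        if s > decision_limit:
            alarms.append(t)
            s = 0.0  # restart accumulation after raising an alarm
    return alarms

shifted = [0, 0, 1, 1, 1, 1]
without_reset = cusum_alarms(shifted, 0.0, slack=0.5, decision_limit=1.2)
with_reset = cusum_alarms(shifted, 0.0, slack=0.5, decision_limit=1.2, resets={2: 1.0})
```

In the example the level shift at index 2 triggers an alarm when the baseline is left unchanged, but not when the shift is declared a legitimate post-maintenance operating point through `resets`.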

4. Decision Logic and Maintenance Policy Integration

The TBPM framework interfaces tightly with wider maintenance and logistics processes:

Event-driven Inspection and Action: Inspection timings are computed to guarantee an upper bound on the failure probability within any observation interval, and maintenance is performed if the HI or degradation crosses threshold at inspection (Soltani, 2018).
Traffic-light Logic: Multi-level thresholds (e.g., green/amber/red) provide graded warnings; for instance, three anomaly bands denote action urgency and drive operator decisions for pre-fault interventions (Givnan et al., 2021).
Inventory and Spare Part Coordination: Joint optimization of preventive-threshold, re-order levels, and maximum number of imperfect interventions links TBPM to logistics planning, subject to availability constraints and multi-location part flows (Soltani, 2018).
Cost-aware Scheduling: Especially in indirect/proxy data regimes, the probability that an asset's age or cycle count has exceeded its maintenance threshold is used as a risk/penalty term in mixed-integer maintenance scheduling to balance cost, disruption, and grouping efficiencies (Bauer et al., 15 Mar 2026).
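The traffic-light logic in particular is easy to make concrete. The sketch below assumes two thresholds that split an anomaly score into three bands; the function name and the threshold values in the usage lines are illustrative, not those used by Givnan et al. (2021).

```python
def traffic_light(score, amber_threshold, red_threshold):
    """Classify an anomaly score into a green/amber/red urgency band."""
    if amber_threshold >= red_threshold:
        raise ValueError("amber_threshold must be below red_threshold")
    if score >= red_threshold:
        return "red"      # act now
    if score >= amber_threshold:
        return "amber"    # plan an intervention
    return "green"        # normal operation

# Hypothetical thresholds on a reconstruction-error score.
levels = [traffic_light(s, amber_threshold=0.4, red_threshold=0.8)
          for s in (0.1, 0.5, 0.9)]
```

Keeping the band boundaries as explicit arguments makes it possible to calibrate each one separately, for example with different multiples $k$ of the healthy-data standard deviation.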

5. Empirical Performance, Sensitivity, and Comparative Metrics

TBPM approaches are evaluated using both statistical-detection metrics and application-specific business scores:

Detection Rates: In real-world rotating-equipment studies, simple threshold-based univariate HIs achieve high TPR (90.9%) and low FPR (5.1%), comparable to or exceeding more complex SVM-based models (Hamaide et al., 2022).
Lead Time and False Alarms: TBPM’s early warning capability is tunable via threshold position and window length. In autoencoder models, alarms are issued 70 minutes prior to failure at moderate false positive rates (Givnan et al., 2021). In wind turbines, anomaly alarms appeared 6–12 weeks before failures (Kenbeek et al., 2016).
Cost and Profit Metrics: Explicit cost-based evaluation in rare-failure wind turbine settings demonstrates that threshold-based methods, when sampled by profit, outperform both random and reactive policies, yielding empirical distributions of net maintenance savings (Begun et al., 2024).
Sensitivity and Trade-offs: Lowering thresholds reduces missed alarms but increases false positives; smoothing/aggregation windows mediate detection granularity versus noise robustness (Hamaide et al., 2022, Givnan et al., 2021).
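The threshold trade-off can be checked directly on labeled scores. The sketch below computes TPR and FPR for a given threshold on a toy data set; the scores and labels are invented for illustration and do not come from any of the cited studies.

```python
def detection_rates(scores, labels, threshold):
    """Return (TPR, FPR) for the rule 'alarm if score > threshold'.

    labels: 1 (or True) for samples that precede a failure, 0 otherwise.
    A rate is NaN when its denominator is zero.
    """
    tp = fp = fn = tn = 0
    for s, y in zip(scores, labels):
        alarm = s > threshold
        if alarm and y:
            tp += 1
        elif alarm:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr

scores = [0.1, 0.4, 0.6, 0.9]   # toy health-indicator values
labels = [0, 0, 1, 1]           # toy ground truth
sweep = {t: detection_rates(scores, labels, t) for t in (0.3, 0.5, 0.7)}
```

On this toy data, lowering the threshold from 0.7 to 0.5 recovers the missed failure, and lowering it further to 0.3 adds a false alarm without any gain in detection.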

Key empirical insights are summarized in the following table:

Application	TPR	FPR	Lead Time	Notable Result
Rotating machine (univariate HI)	90.9%	5.1%	up to 7 days	EWMA-smoothed HI matches/bests SVM models (Hamaide et al., 2022)
Rotary machine (autoencoder)	—	—	70 min pre-fault	Three traffic-light levels reduce subjective thresholding (Givnan et al., 2021)
Wind turbine (adaptive SPC)	~90%	~7–12%	6–12 weeks	Adaptive residual chart flags rare events earlier than fixed limits (Kenbeek et al., 2016)
Wind turbine (profit-based)	—	—	34–60 days	Threshold-sampling achieves lower mean and min costs than random/reactive (Begun et al., 2024)

6. Advanced and Non-standard TBPM Extensions

Recent research extends TBPM outside the classical supervised regime:

Proxy-Data Models: For assets without direct cycle counters (e.g., railway doors), aggregate flows are inferred via Bayesian models on passenger routing, and maintenance is triggered using stochastic cycle/age accumulators compared to uncertain thresholds (Bauer et al., 15 Mar 2026).
CUSUM Approaches: Accumulated deviations or residuals (post-calibration to compensate for maintenance resets) enable robust detection of small, persistent drifts, which are critical in rare-failure or weak-signal settings (Begun et al., 2024).
Maintenance Scheduling Integration: Binary decision variable frameworks minimize disruption, setup, and delay/penalty costs subject to per-asset and grouped asset constraints, using probabilistic overdue estimates as scheduling drivers (Bauer et al., 15 Mar 2026).
Statistical Guarantee Policies: Event-driven inspections based on maximum tolerated failure risk per interval provide distributional control over operational reliability (Soltani, 2018).
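The "probabilistic overdue estimate" used as a scheduling driver can be illustrated with a simple closed form. The sketch below assumes that both the inferred cycle count and the maintenance threshold are approximately normal and independent. That is an assumption made here for illustration; Bauer et al. (2026) use Bayesian flow models whose posterior need not be normal. The function and parameter names are hypothetical.

```python
import math

def prob_overdue(mean_cycles, std_cycles, threshold, threshold_std=0.0):
    """P(cycles > threshold) under an independent-normal approximation.

    The difference (cycles - threshold) is normal with mean
    mean_cycles - threshold and variance std_cycles**2 + threshold_std**2.
    """
    combined_std = math.hypot(std_cycles, threshold_std)
    if combined_std == 0.0:
        return 1.0 if mean_cycles > threshold else 0.0
    z = (threshold - mean_cycles) / combined_std
    # Upper tail of the standard normal, written with the complementary
    # error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical asset: estimated 95,000 cycles +/- 4,000 against a
# nominal 100,000-cycle maintenance threshold.
risk = prob_overdue(95_000, 4_000, 100_000)
```

A scheduler can then weight each asset's delay penalty by this probability, so that assets with uncertain but likely overdue counts are prioritized over assets that are clearly within their limits.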

A recurring theme is the quantification and explicit propagation of uncertainty—from sensor modeling, through anomaly detection, to final cost/risk evaluation.

7. Limitations and Directions for Further Research

Key limitations arise from the underlying data and process models:

Optimality under Scarce Data: In rare-failure settings, threshold selection must accommodate the stochasticity of both failures and detection, often requiring calibration across turbines/assets or profit-based threshold sampling rather than optimization (Begun et al., 2024).
Imperfect Maintenance Effects: Successive imperfect interventions can accelerate degradation, mandating explicit constraints on allowable maintenance sequences and integration with spare part logistics (Soltani, 2018).
Reset and Concept Drift: Model recalibration post-maintenance, accounting for non-stationarity, and adjusting thresholds in the presence of regime shifts are open research challenges (Begun et al., 2024, Givnan et al., 2021).
Interpretability and Human Factors: Traffic-light systems and graded alarms address subjective thresholding but must be balanced against the risk of over-alerting or excessive cost (Givnan et al., 2021).
Generalizability to Indirect Monitoring: Proxy- and model-based TBPM solutions allow deployment for low-data regimes but rely on accurate propagation of model and data uncertainty throughout the action chain (Bauer et al., 15 Mar 2026).

A plausible implication is that the integration of TBPM with cost-aware, uncertainty-guided, and data-driven logic will continue to expand its efficacy and operational value, particularly as industrial systems shift towards automated, IoT-based maintenance regimes.

References:

“Joint Optimization of Opportunistic Predictive Maintenance and Multi-location Spare Part Inventories for a Deteriorating System Considering Imperfect Actions” (Soltani, 2018)
“Data-driven online monitoring of wind turbines” (Kenbeek et al., 2016)
“Cost-optimized probabilistic maintenance for condition monitoring of wind turbines with rare failures” (Begun et al., 2024)
“Low-Data Predictive Maintenance of Railway Station Doors and Elevators Using Bayesian Proxy Flow Modeling” (Bauer et al., 15 Mar 2026)
“A two-level machine learning framework for predictive maintenance: comparison of learning formulations” (Hamaide et al., 2022)
“Real-Time Predictive Maintenance using Autoencoder Reconstruction and Anomaly Detection” (Givnan et al., 2021)