XGBoost Fatigue Recognition
- XGBoost Fatigue Recognition is an approach that applies the ensemble learning technique XGBoost to detect human fatigue using physiological, behavioral, and appearance-based signals.
- The methodology employs robust feature extraction, including HRV metrics and sliding window aggregation, with SHAP-based interpretability enhancing model insights.
- Studies show that XGBoost can match or outperform traditional methods in accuracy while highlighting challenges such as adversarial vulnerability and the need for richer multimodal integration.
XGBoost Fatigue Recognition refers to the application of eXtreme Gradient Boosting (XGBoost), a high-performance ensemble learning technique, to the detection and prediction of human fatigue using physiological, behavioral, or appearance-based data streams. Recognizing fatigue reliably is pivotal for domains including driver monitoring, workplace safety, healthcare, and human–machine teaming, where lapses in alertness have critical consequences. This article details the technical landscape of XGBoost-based fatigue recognition, spanning signal domains, feature engineering, modeling protocols, interpretability via SHAP, robustness under adversarial perturbation, and practical limitations.
1. Physiological and Behavioral Data Sources
Fatigue manifests through numerous physiological (e.g., electrocardiogram [ECG], electroencephalogram [EEG], heart rate variability [HRV]), behavioral (e.g., steering wheel movement, posture), and appearance-based (e.g., facial landmarks) signals. Representative data inputs in the literature include:
- Electrocardiogram and HRV: Time-domain (SDNN, RMSSD, pNN50), nonlinear (SD1, SD2, SD2/SD1 ratio from Poincaré plots), and frequency-domain (VLF, LF, HF powers) features are routinely extracted per time window to capture cardiac autonomic fluctuations indicative of fatigue. (Vitorino et al., 2023)
- Facial landmarks: Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR), computed from specific facial keypoints, provide parsimonious descriptors for eye closure and yawning associated with drowsiness. (Chen et al., 2023)
- Behavioral and Multimodal Measures: Heart rate, breathing rate, HRV, steering wheel angle/torque, and posture (slouching, tilting) are synthesized into a multivariate feature set for regression against fatigue markers such as PERCLOS (percentage of eyelid closure over time; a minimal computation is sketched below). (Zhou et al., 2021)
The choice of signal modality determines pipeline design and interpretability, with lower-intrusion signals (ECG, facial landmarks) favored for real-time systems.
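To make the PERCLOS target concrete, the following minimal Python sketch derives a PERCLOS-style value from per-frame Eye Aspect Ratio (EAR) samples. The closure threshold, sampling rate, and simulated signal are illustrative assumptions, not parameters reported in the cited studies.

```python
import numpy as np

def perclos(ear_values, closed_threshold=0.2):
    """Fraction of frames in a window where the eye is considered closed.

    ear_values: per-frame Eye Aspect Ratio values for one time window.
    closed_threshold: EAR below this value is treated as eye closure
    (an illustrative threshold, not a value from the cited studies).
    """
    ear_values = np.asarray(ear_values, dtype=float)
    return float(np.mean(ear_values < closed_threshold))

# Example: 30 s of 10 Hz EAR samples with a simulated drowsy episode.
rng = np.random.default_rng(0)
ear = 0.3 + 0.02 * rng.standard_normal(300)
ear[100:160] = 0.12          # sustained eye closure
print(f"PERCLOS = {perclos(ear):.2f}")  # ~0.20
```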
2. Feature Extraction and Signal Preprocessing
Fatigue recognition pipelines generally employ structured pre-processing and feature engineering:
- Facial Feature Computation: EAR is defined as $\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert}$, where $p_1, \dots, p_6$ denote landmark coordinates for the eye. MAR is analogously defined for mouth landmarks. (Chen et al., 2023)
- HRV Feature Definitions: For RR intervals $RR_1, \dots, RR_N$ with mean $\overline{RR}$:
- SDNN: $\sqrt{\tfrac{1}{N-1}\sum_{i=1}^{N}\bigl(RR_i - \overline{RR}\bigr)^2}$
- RMSSD: $\sqrt{\tfrac{1}{N-1}\sum_{i=1}^{N-1}\bigl(RR_{i+1} - RR_i\bigr)^2}$
- Nonlinear: SD2/SD1 ratio (Poincaré), detrended fluctuation analysis (DFA). (Vitorino et al., 2023)
- Windowed Aggregation: For time-varying signals, features are extracted over sliding windows of 120–150 s, balancing temporal resolution with physiological informativeness (see the feature-extraction sketch after this list). (Vitorino et al., 2023)
- Labeling Strategies: Labels may derive from behavioral ground truth (e.g., PERCLOS or PVT reaction times) or expert annotation; binary or regression targets are supported. (Zhou et al., 2021, Vitorino et al., 2023, Chen et al., 2023)
- Preprocessing Routines: Signal normalization, artifact rejection (especially in ECG/EEG), grayscale conversion (for images), and landmark normalization are standard. (Chen et al., 2023, Vitorino et al., 2023, Zhou et al., 2021)
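A minimal Python sketch of the feature computations above, assuming eye landmark coordinates and RR intervals have already been extracted. The 150 s window length follows the range cited above; the 30 s step and the minimum-beat check are illustrative assumptions.

```python
import numpy as np

def eye_aspect_ratio(p):
    """EAR from six eye landmarks p[0..5] (each an (x, y) pair),
    following the formulation given above."""
    p = np.asarray(p, dtype=float)
    return (np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])) / (
        2.0 * np.linalg.norm(p[0] - p[3])
    )

def hrv_time_domain(rr_ms):
    """SDNN, RMSSD, pNN50, and mean heart rate from RR intervals (ms)."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    return {
        "SDNN": rr.std(ddof=1),
        "RMSSD": np.sqrt(np.mean(diff ** 2)),
        "pNN50": np.mean(np.abs(diff) > 50.0),
        "meanHR": 60000.0 / rr.mean(),
    }

def windowed_hrv(rr_ms, rr_times_s, window_s=150.0, step_s=30.0):
    """Aggregate HRV features over sliding windows (step size is illustrative)."""
    rr = np.asarray(rr_ms, dtype=float)
    t = np.asarray(rr_times_s, dtype=float)
    rows, start = [], t[0]
    while start + window_s <= t[-1]:
        mask = (t >= start) & (t < start + window_s)
        if mask.sum() > 2:          # require at least a few beats per window
            rows.append(hrv_time_domain(rr[mask]))
        start += step_s
    return rows
```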
3. XGBoost Modeling Paradigms
XGBoost applies gradient-boosted decision trees, leveraging regularized loss minimization and efficient second-order optimization. Protocols include:
- Objective Formulations:
- Binary classification: logistic loss, $l(y_i, \hat{y}_i) = -\bigl[y_i \log \sigma(\hat{y}_i) + (1 - y_i)\log\bigl(1 - \sigma(\hat{y}_i)\bigr)\bigr]$ (Chen et al., 2023, Vitorino et al., 2023)
- Regression: squared-error loss, $l(y_i, \hat{y}_i) = (y_i - \hat{y}_i)^2$ (Zhou et al., 2021)
- Regularization: $\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$, where $T$ is the leaf count and $w_j$ are the leaf weights. (Zhou et al., 2021, Vitorino et al., 2023)
- Hyperparameterization:
- Learning rate $\eta$: tuned per task for regression, $0.3$ (classification, default)
- Tree-structure parameters (e.g., maximum depth, number of boosting rounds)
- Sampling parameters (e.g., row and column subsampling ratios)
- Grid search and cross-validation optimize these values (a training sketch follows this list). (Zhou et al., 2021, Vitorino et al., 2023, Chen et al., 2023)
- Training/Validation Split: A 70/30 train/test split is conventional, with stratification for binary tasks; 10-fold cross-validation is used for small samples. (Zhou et al., 2021, Vitorino et al., 2023, Chen et al., 2023)
- Feature Reduction: SHAP-based selection may reduce the feature set (e.g., to 18 HRV metrics) with no loss in discrimination. (Vitorino et al., 2023)
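A minimal training sketch under the protocol above (binary classification, 70/30 stratified split, grid search with cross-validation). The synthetic data, the hyperparameter grid, and the 5-fold setting are illustrative assumptions rather than the configurations used in the cited studies.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import f1_score, roc_auc_score

# Placeholder data standing in for windowed HRV feature matrices:
# one row per window, binary drowsiness label per window.
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 18))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.standard_normal(500) > 0).astype(int)

# 70/30 stratified split, as described above.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Grid search over a small, illustrative hyperparameter grid.
param_grid = {
    "max_depth": [3, 5],
    "learning_rate": [0.1, 0.3],
    "n_estimators": [100, 300],
}
clf = GridSearchCV(
    xgb.XGBClassifier(objective="binary:logistic", eval_metric="logloss"),
    param_grid,
    scoring="f1_macro",
    cv=5,
)
clf.fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
print("macro-F1:", f1_score(y_te, clf.predict(X_te), average="macro"))
print("ROC-AUC :", roc_auc_score(y_te, proba))
```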
4. Performance Metrics and Comparative Outcomes
Evaluation of XGBoost fatigue models uses metrics aligned with the task formulation (a metric-computation sketch follows the comparisons below):
| Study | Data Domain | Task | Key Metrics | Top XGBoost Result |
|---|---|---|---|---|
| (Chen et al., 2023) | Facial images | Binary (fatigue) | Accuracy, sensitivity | Accuracy: 87.37%, Sensitivity: 89.14% |
| (Vitorino et al., 2023) | ECG/HRV | Binary (drowsiness) | Macro-F1, ROC-AUC | F1: 77.3% (150s window), AUC: ~0.85 |
| (Zhou et al., 2021) | Multimodal | Regression (PERCLOS) | RMSE, MAE, Adjusted $R^2$ | RMSE: 3.847, MAE: 1.768, Adjusted $R^2$: 0.996 |
Broader analyses yielded:
- Performance Parity or Superiority: XGBoost outperforms or matches SVM, Random Forest, and regression baselines in both classification and regression fatigue prediction tasks. (Vitorino et al., 2023, Zhou et al., 2021)
- Window Optimization: HRV-based XGBoost models are optimal in the 120–150s window regime for responsiveness and physiological discriminability. (Vitorino et al., 2023)
- Feature Compression: Restricting to SHAP-selected feature subsets (e.g., 18 HRV metrics) preserves F1. (Vitorino et al., 2023)
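For completeness, the regression metrics reported in the table (RMSE, MAE, adjusted $R^2$) can be computed as below, with adjusted $R^2 = 1 - (1 - R^2)\frac{n-1}{n-p-1}$. The numeric values are placeholders, not data from the cited study.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

# Hypothetical PERCLOS predictions (percent) against ground truth.
y_true = np.array([12.0, 30.0, 55.0, 8.0, 40.0, 22.0])
y_pred = np.array([14.0, 28.0, 50.0, 10.0, 43.0, 20.0])

print("RMSE       :", np.sqrt(mean_squared_error(y_true, y_pred)))
print("MAE        :", mean_absolute_error(y_true, y_pred))
print("Adjusted R2:", adjusted_r2(y_true, y_pred, n_features=3))
```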
5. Interpretability via SHAP and Main Effects
Interpretability is addressed using SHAP (SHapley Additive exPlanations):
- SHAP Assignment: For each feature $j$ and instance $i$, the SHAP value $\phi_{ij}$ quantifies feature $j$'s contribution to the deviation of the model output from the mean prediction (a SHAP sketch follows this list).
- Global Explanations: Features ranked by mean absolute SHAP values identify key predictors (e.g., HRV nonlinear SD2/SD1 ratio, pNN50, RMSSD, mean HR). (Vitorino et al., 2023, Zhou et al., 2021)
- Local Explanations: Individual predictions are decomposed into per-feature contributions, enabling actionable insights for intervention (e.g., breathing exercises to raise HRV and lower predicted fatigue). (Zhou et al., 2021)
- Dependence Plots: Visualization of main effects—such as V-shaped HRV dependence or monotonic effects of heart/breath rate—enable physiological interpretation and operational guidelines. (Zhou et al., 2021)
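The SHAP workflow above can be sketched with the shap package's TreeExplainer. The feature matrix here is a synthetic stand-in for named HRV metrics; model settings and data are illustrative assumptions.

```python
import numpy as np
import shap
import xgboost as xgb

# Synthetic stand-in for an HRV feature matrix and drowsiness labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 18))
y = (X[:, 0] - 0.8 * X[:, 3] + 0.3 * rng.standard_normal(400) > 0).astype(int)

model = xgb.XGBClassifier(
    objective="binary:logistic", n_estimators=200, max_depth=4, eval_metric="logloss"
).fit(X, y)

# TreeExplainer computes SHAP values (log-odds contributions) for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)      # shape: (n_samples, n_features)

# Global explanation: rank features by mean absolute SHAP value.
mean_abs = np.abs(shap_values).mean(axis=0)
print("Top 5 features:", np.argsort(mean_abs)[::-1][:5])

# Local explanation: per-feature contributions for a single window.
print("Window 0 contributions:", np.round(shap_values[0], 3))

# Dependence plot of the leading feature's main effect (requires matplotlib).
shap.dependence_plot(int(np.argmax(mean_abs)), shap_values, X, show=False)
```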
6. Robustness Considerations and Adversarial Analysis
Model reliability under input uncertainty is critical for practical fatigue monitoring systems:
- Adversarial Vulnerability: XGBoost models experience F1 degradation under physiologically plausible adversarial attacks, with drops of 15–18 percentage points (e.g., F1: 77.28% → 62.2%). (Vitorino et al., 2023)
- Adversarial Training: Augmentation with A2PM-perturbed samples during training restores F1 within 1–2 points of clean performance and increases resistance relative to SVM/KNN baselines. (Vitorino et al., 2023)
- Best Practices: Use of realistic, domain-constrained perturbation methods and feature selection improves deployment security (a hedged augmentation sketch follows this list).
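A hedged sketch of the adversarial-augmentation idea: a simple, generic domain-constrained perturbation is used in place of A2PM (whose implementation is not reproduced here), and the data are synthetic placeholders. The point is only to illustrate augmenting the training set with perturbed copies before fitting XGBoost.

```python
import numpy as np
import xgboost as xgb

def constrained_perturbation(X, scale=0.05, bounds=None, rng=None):
    """Add small multiplicative noise and clip to plausible bounds.

    This is a simplified, generic perturbation for illustration only;
    it is NOT the A2PM method used in the cited study.
    """
    rng = rng or np.random.default_rng(0)
    X_adv = X * (1.0 + scale * rng.standard_normal(X.shape))
    if bounds is not None:
        lo, hi = bounds
        X_adv = np.clip(X_adv, lo, hi)
    return X_adv

# Synthetic HRV-like features and labels (placeholders).
rng = np.random.default_rng(1)
X = np.abs(rng.standard_normal((600, 18))) + 0.5
y = (X[:, 0] > X[:, 1]).astype(int)

# Adversarial training: augment the training set with perturbed copies.
X_adv = constrained_perturbation(X, scale=0.05, bounds=(0.0, None))
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y])

model = xgb.XGBClassifier(objective="binary:logistic", eval_metric="logloss")
model.fit(X_aug, y_aug)
```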
7. Limitations and Forward Directions
Current XGBoost fatigue recognition frameworks expose several open challenges:
- Feature Scope: Many studies limit features (e.g., only EAR/MAR for facial analysis, or exclude temporal dynamics), restricting model generalizability. (Chen et al., 2023)
- Dataset Size and Diversity: Limited subject pools and poorly specified dataset scales impede reproducibility and external validity. (Chen et al., 2023)
- Temporal Analysis: Incorporating sequential dependencies (e.g., via temporal voting or recurrent architectures) is recommended for framewise video streams and real-time monitoring (a simple temporal-voting sketch follows this list). (Chen et al., 2023)
- Broader Modal Integration: Multimodal signal fusion (e.g., combining ECG, EMG, EEG, and EOG) can further enhance accuracy, robustness, and translatability, as evidenced in studies evaluating cross-signal relationships, though explicit XGBoost implementations with these fusions require fuller technical disclosure. (Kakhi et al., 26 Sep 2025)
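As an illustration of the temporal-voting recommendation above (not an implementation from the cited work), a rolling majority vote over framewise predictions can suppress isolated misclassifications; the window length is an assumed value.

```python
import numpy as np

def temporal_vote(frame_predictions, window=15):
    """Majority-vote smoothing of per-frame fatigue predictions.

    frame_predictions: sequence of 0/1 framewise labels.
    window: number of frames per vote (illustrative value).
    Returns a smoothed label per frame via a centered rolling majority.
    """
    preds = np.asarray(frame_predictions, dtype=int)
    half = window // 2
    smoothed = np.empty_like(preds)
    for i in range(len(preds)):
        lo, hi = max(0, i - half), min(len(preds), i + half + 1)
        smoothed[i] = int(preds[lo:hi].mean() >= 0.5)
    return smoothed

# Example: isolated false positives are suppressed by the rolling vote.
raw = np.array([0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0])
print(temporal_vote(raw, window=5))
```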
Fatigue recognition with XGBoost thus demonstrates strong performance, interpretability, and robustness under informed modeling and evaluation frameworks, but ongoing refinements in feature engineering, adversarial resilience, temporal modeling, and cross-modal generalization remain active areas of research.