In-Cabin Driver Monitoring Systems

Updated 2 January 2026
  • In-cabin Driver Monitoring Systems are integrated sensor suites that continuously track driver physiology and behavior to ensure safety.
  • They utilize a range of modalities, including visible, IR, radar, and wearables, combined through fusion strategies for robust state estimation.
  • Advanced algorithms, from deep learning to classical methods, enable real-time detection of drowsiness, distraction, and other critical driver states.

An in-cabin Driver Monitoring System (DMS) is an integrated suite of sensing, signal-processing, and algorithmic classification modules designed to ascertain and continually track the physical, cognitive, and behavioral state of vehicle occupants, with particular emphasis on detecting conditions such as distraction, drowsiness, and cognitive overload, and on monitoring seatbelt compliance. These systems must satisfy stringent requirements for real-time safety intervention, accuracy, robustness to environmental and behavioral variance, low false-alarm rates, and minimal computational and power footprint in production vehicles. DMS research encompasses sensor technologies (visible, IR, neuromorphic, radar, wearables), deep learning and classical machine learning, multi-modal fusion strategies, benchmarking datasets, and system-level design that moves beyond detection toward trusted driver-vehicle interaction and future agentic or robotic augmentation.

1. Key Sensed Driver States, Indicators, and Measurement Modalities

In-cabin DMS research organizes monitored states into five principal “substates,” each with associated physiological and behavioral indicators, mapped to sensor modalities (Halin et al., 2021):

  • Drowsiness: PERCLOS (percentage of time the eyelids are ≥70% closed), blink frequency, mean blink duration, EEG theta power, HR/HRV metrics (RMSSD, SDNN), breathing rate, head pose dynamics; measured via IR/RGB camera, PPG/ECG, radar, thermal imaging (see the PERCLOS sketch at the end of this section).
  • Mental Workload: HR/HRV (LF/HF ratio), EEG α/θ modulation, pupil diameter, gaze dispersion, SDLP (standard deviation of lane position); acquired from wearables, cameras, and vehicle CAN-bus.
  • Distraction: Hand positions, gaze direction and entropy, EOR (eyes-off-road) duration, auditory distraction via pupillary and EEG measures; multi-modal acquisition from cameras, microphones, IMUs.
  • Emotions (“Stress/Anger”): Facial expressions (AU detection), vocal prosody, HR/HRV, EDA, erratic braking/steering; sensors include cameras, cockpit microphones, wearables.
  • Influence (Alcohol/Drugs): HR, facial temperature (thermal IR), BAC sensors, gaze instability, erratic driving behavior.

Tables I and II in (Halin et al., 2021) provide an exhaustive mapping between states, indicators, and sensors (EEG, ECG, PPG, radar, visible/NIR/thermal/neuromorphic cameras, CAN-bus, wearables).
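
As a concrete illustration of how the camera-derived drowsiness indicators above are typically computed, the following minimal sketch derives PERCLOS, blink frequency, and mean blink duration from a per-frame eyelid-closure signal. The 70% closure criterion matches the conventional PERCLOS definition, but the function name, the input representation, and the fixed analysis window are illustrative assumptions rather than details from the cited work.

```python
import numpy as np

def drowsiness_indicators(eye_closure, fps=30.0, closed_thresh=0.7):
    """Compute PERCLOS, blink rate, and mean blink duration (illustrative sketch).

    eye_closure:   per-frame eyelid-closure ratio in [0, 1] (1 = fully closed),
                   e.g. derived from eye-landmark aspect ratios.
    fps:           camera frame rate in frames per second.
    closed_thresh: closure ratio above which the eye counts as "closed"
                   (the conventional PERCLOS criterion is ~70 %).
    """
    eye_closure = np.asarray(eye_closure, dtype=float)
    if eye_closure.size == 0:
        return 0.0, 0.0, 0.0
    closed = eye_closure >= closed_thresh

    # PERCLOS: fraction of time the eyes are >=70 % closed over the window.
    perclos = float(closed.mean())

    # Detect blinks as contiguous runs of "closed" frames.
    edges = np.diff(closed.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1) + 1
    if closed[0]:
        starts = np.r_[0, starts]
    if closed[-1]:
        ends = np.r_[ends, closed.size]
    durations_s = (ends - starts) / fps

    window_minutes = closed.size / fps / 60.0
    blink_rate_per_min = len(durations_s) / window_minutes
    mean_blink_s = float(durations_s.mean()) if len(durations_s) else 0.0
    return perclos, blink_rate_per_min, mean_blink_s
```

In practice such indicators are usually computed over sliding windows of one to several minutes and fused with physiological channels before any drowsiness decision is issued.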

2. Sensing Technologies and Data Acquisition

Modern DMS utilize a synergistic array of sensors, including visible, NIR, thermal, and neuromorphic cameras, radar, microphones, IMUs, wearables, and vehicle CAN-bus data (Farooq et al., 2023, Kielty et al., 2023, Tavakoli et al., 2021, Hariharan et al., 2023).

This sensor constellation enables redundancy, coverage under challenging lighting (night, glare), and multimodal robustness to individual sensor or environmental failure cases.

3. Feature Extraction, Multi-Modal Fusion, and System Architecture

DMS pipelines extract high-level features from each modality and employ fusion strategies to improve specificity and reliability (Lin et al., 2024, Farooq et al., 2023, Tavakoli et al., 2021, Sini et al., 2023, Kielty et al., 2023, Hu, 2022).

Such architectures permit real-time, energy-efficient, and robust multi-class behavior/action recognition and physiological state estimation under diverse conditions.
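
A common realization of decision-level (late) fusion in such pipelines is to let each modality-specific classifier emit class probabilities and combine them with reliability weights, so a dropped-out or degraded sensor simply contributes nothing. The sketch below is a generic, hypothetical version of that pattern; the modality names, weights, and three-state label set are illustrative and not taken from the cited systems.

```python
import numpy as np

# Hypothetical per-modality reliability weights; in a real system these would be
# calibrated on validation data or adapted online (e.g., down-weighting a camera
# that reports occlusion or low illumination).
MODALITY_WEIGHTS = {"rgb_camera": 1.0, "ir_camera": 0.8, "radar": 0.6, "wearable": 0.7}

def late_fusion(per_modality_probs, weights=MODALITY_WEIGHTS):
    """Combine per-modality class-probability vectors by weighted averaging.

    per_modality_probs: dict mapping modality name -> probability vector over
                        driver states (e.g., [alert, drowsy, distracted]).
                        Modalities that dropped out are simply absent.
    Returns the fused probability vector and the winning class index.
    """
    acc, total = None, 0.0
    for name, probs in per_modality_probs.items():
        w = weights.get(name, 0.0)
        if w <= 0.0:
            continue
        p = np.asarray(probs, dtype=float)
        acc = w * p if acc is None else acc + w * p
        total += w
    if acc is None:
        raise ValueError("no usable modality available for fusion")
    fused = acc / total
    return fused, int(np.argmax(fused))

# Example: the IR camera is missing (night-time dropout); fusion still works.
fused, state = late_fusion({
    "rgb_camera": [0.2, 0.7, 0.1],   # drowsy most likely
    "radar":      [0.4, 0.5, 0.1],
    "wearable":   [0.3, 0.6, 0.1],
})
```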

4. Algorithms, Model Architectures, and Training Protocols

DMS employ classical ML, deep learning, vision-LLMs, and mixture-of-experts approaches, supported by significant advances in model architecture and dataset development (Hu, 2022, Kielty et al., 2023, Riya et al., 2024, Cañas et al., 15 Mar 2025, Ortega et al., 2020).

Reported metrics include precision, recall, F1, Top-1 accuracy, latency (often <40 ms/frame for vision modules and sub-millisecond for event-based processing), and task-specific error measures (seatbelt, blink, yawn, and gaze-zone detection).
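
These metrics can be reproduced with standard tooling. The sketch below shows one way to compute frame-level Top-1 accuracy, macro precision/recall/F1, and per-frame latency for an arbitrary driver-state classifier; `model_predict` is a placeholder callable, not an API from the cited papers.

```python
import time
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

def evaluate_frames(model_predict, frames, labels):
    """Frame-level evaluation of a driver-state classifier (illustrative sketch).

    model_predict:  callable mapping a single frame to a predicted class id.
    frames, labels: sequences of input frames and ground-truth class ids.
    Returns a dict with accuracy, macro precision/recall/F1, and latency stats.
    """
    preds, latencies_ms = [], []
    for frame in frames:
        t0 = time.perf_counter()
        preds.append(model_predict(frame))
        latencies_ms.append((time.perf_counter() - t0) * 1e3)

    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0)
    return {
        "top1_accuracy": accuracy_score(labels, preds),
        "precision_macro": precision,
        "recall_macro": recall,
        "f1_macro": f1,
        "latency_ms_mean": float(np.mean(latencies_ms)),
        "latency_ms_p95": float(np.percentile(latencies_ms, 95)),
    }
```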

5. System Integration, Deployment, and Real-World Constraints

Deployment requirements and architectural solutions include embedded hardware optimization, pipeline latency management, sensor placement, and live-system trustworthiness (Ahsani et al., 26 Dec 2025, Hariharan et al., 2023, Kielty et al., 2023, Cañas et al., 29 Apr 2025):

  • Hardware and Latency
    • Embedded SoCs: NVIDIA Jetson, TI-TDA4VM, Raspberry Pi, Coral Edge TPU; optimized for INT8 quantized inference, DMA/FPGA acceleration, per-frame latency <60 ms for full pipelines, >30 FPS operational speed (Ahsani et al., 26 Dec 2025, Hariharan et al., 2023).
  • Placement and Lighting Robustness
  • Fusion and Decision Modules
    • Central DMS ECU aggregates seatbelt, gaze, blink, pose, drowsiness; event-based seatbelt and blink modules operate at >30 Hz for real-time state change detection (Kielty et al., 2023).
    • Dual-camera pipelines provide occlusion-aware fallback: RGB as the primary stream, IR as backup during persistent occlusion or low light; region-based gaze, identification, and occlusion estimation via EfficientNet/MobileNet features (Cañas et al., 29 Apr 2025); see the fallback sketch after this list.
  • Human-Centered and Agentic Intelligence
    • Behavioral signals (drowsiness, distraction, engagement) routed to higher-order decision modules for personalized interventions, handover readiness in SAE Level 3/4 vehicles; privacy-first, on-device inference to limit raw video exposure (Ahsani et al., 26 Dec 2025).
  • Robustness and Regulation
    • Multi-modal redundancy (wearables+radar+cameras) and fused classifier outputs mitigate single-sensor dropout; compliance with EuroNCAP and EU 2019/2144 real-time warning/alert mandates (Sini et al., 2023, Cañas et al., 29 Apr 2025).
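
To make the dual-camera fallback above concrete, the following sketch implements one possible hysteresis policy for switching from the primary RGB stream to the IR backup under persistent occlusion or low light, and back again. The thresholds, frame counts, and input signals (mean luma, face-occlusion ratio) are illustrative assumptions, not parameters reported in the cited work.

```python
class CameraFallback:
    """Hysteresis-based selection between a primary RGB and a backup IR stream.

    Switches to IR only after the RGB stream has been unusable (occluded or
    too dark) for `fail_frames` consecutive frames, and switches back only
    after `recover_frames` consecutive good RGB frames, to avoid flapping.
    All thresholds are illustrative defaults.
    """

    def __init__(self, fail_frames=15, recover_frames=30,
                 min_luma=40.0, max_occlusion=0.5):
        self.fail_frames = fail_frames
        self.recover_frames = recover_frames
        self.min_luma = min_luma            # mean-brightness floor for usable RGB
        self.max_occlusion = max_occlusion  # max fraction of the face occluded
        self.active = "rgb"
        self._bad, self._good = 0, 0

    def update(self, rgb_mean_luma, face_occlusion_ratio):
        """Call once per frame; returns the stream to use ("rgb" or "ir")."""
        rgb_ok = (rgb_mean_luma >= self.min_luma and
                  face_occlusion_ratio <= self.max_occlusion)
        if self.active == "rgb":
            self._bad = self._bad + 1 if not rgb_ok else 0
            if self._bad >= self.fail_frames:
                self.active, self._good = "ir", 0
        else:
            self._good = self._good + 1 if rgb_ok else 0
            if self._good >= self.recover_frames:
                self.active, self._bad = "rgb", 0
        return self.active
```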

6. Benchmarks, Datasets, and Quantitative Performance

Comprehensive, open datasets underpin DMS algorithm development, alongside comparative analysis and benchmarking (Ortega et al., 2020, Katrolia et al., 2021, Lin et al., 2024):

  • DMD Dataset: 41 h, 37 drivers, RGB/IR/depth, 3 synchronized views (face, hands, body), 93 classes, extensible VCD annotation; enables ≥90 % single-modal accuracy and 93.7 % accuracy with real-time multi-modal fusion (Ortega et al., 2020).
  • TICaM Dataset: Real/synthetic sequences, RGB/depth/IR, 8 in-cabin scenarios (driver, passenger, child/infant seats), 20 activities; multi-task segmentation, detection, pose (Katrolia et al., 2021).
  • Drive&Act: 9.6 million frames, RGB/IR/depth, 83 fine-grained action classes, vehicle-cabin multi-view (Lin et al., 2024).
  • Benchmarks and Evaluation:
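
The datasets above share a common shape: synchronized multi-view, multi-modal frames annotated with fine-grained activities and driver identities. The minimal sketch below captures that shape as a generic sample container together with a driver-disjoint train/test split, a common evaluation practice; the field names are purely illustrative and do not mirror the datasets' actual annotation schemas (DMD, for example, uses the VCD format).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
import numpy as np

@dataclass
class CabinSample:
    """One synchronized time step from a multi-view, multi-modal cabin dataset.

    views maps a view name (e.g., "face", "hands", "body") to a dict of
    modality arrays (e.g., "rgb", "ir", "depth"); labels holds frame-level
    activity/state annotations. All field names are hypothetical.
    """
    timestamp_s: float
    driver_id: str
    views: Dict[str, Dict[str, np.ndarray]]
    labels: Dict[str, str] = field(default_factory=dict)
    gaze_zone: Optional[str] = None

def split_by_driver(samples, test_drivers):
    """Driver-disjoint train/test split, so no identity appears in both sets."""
    train = [s for s in samples if s.driver_id not in test_drivers]
    test = [s for s in samples if s.driver_id in test_drivers]
    return train, test
```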

7. Limitations, Open Challenges, and Future Directions

Persistent research gaps and improvement targets include sensor coverage, model generalization, privacy, and explainability (Halin et al., 2021, Cañas et al., 15 Mar 2025, Wang et al., 2024, Lin et al., 2024, Kielty et al., 2023).

By integrating redundant, multimodal sensing with advanced feature extraction, fusion, and algorithmic reasoning, in-cabin DMS research is positioned to deliver reliable, real-time driver state estimation and safety monitoring for both current and next-generation vehicles.
