PPG Heart Rate Monitoring
- Photoplethysmography is an optical method measuring blood volume changes to estimate heart rate non-invasively.
- It integrates diverse sensing modalities including contact, non-contact, and multi-site fusion to address motion artifacts.
- Robust algorithms—from classical signal processing to deep neural networks—enable accurate heart rate estimation on low-power devices.
Photoplethysmography-based heart rate monitoring refers to the use of optical techniques—primarily reflection- or transmission-mode photoplethysmography (PPG)—to estimate heart rate (HR) and related cardiovascular metrics in both clinical and consumer-grade wearable platforms. PPG enables non-invasive detection of blood volume changes within the microvasculature by measuring modulations in light absorption or scattering at the skin. The field spans a wide spectrum of hardware implementations and algorithmic methodologies, including contact and non-contact sensing, classical signal processing and contemporary deep learning, motion artifact suppression via sensor fusion, and robust deployment on low-power wearable devices.
1. Principles and Physiological Basis
A PPG system consists of a light source (often green or infrared LED) and a photodiode. The LED illuminates the skin, and variations in subcutaneous blood volume modulate the amount of light transmitted or reflected back to the photodetector. The resultant analog PPG waveform contains an AC component, synchronous with the cardiac cycle, superimposed on a slowly varying DC component attributed to static tissue absorption and venous blood volume. Heart rate is inferred from the time intervals between successive systolic peaks (or valleys) in the PPG waveform: $\mathrm{HR} = 60 / \Delta t$ beats per minute, where $\Delta t$ is the distance in seconds between consecutive detected peaks.
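The conversion from peak spacing to heart rate can be illustrated with a minimal Python sketch. The function name `heart_rate_from_peaks` and the example peak times are illustrative assumptions, not taken from any cited work; the sketch assumes peak times have already been detected and are given in seconds.

```python
import numpy as np


def heart_rate_from_peaks(peak_times_s):
    """Return instantaneous HR (BPM) for each interval between successive peaks.

    Hypothetical helper: assumes peak times (seconds) are already detected.
    """
    peak_times_s = np.asarray(peak_times_s, dtype=float)
    if peak_times_s.size < 2:
        raise ValueError("at least two peaks are needed to form an interval")
    ibi_s = np.diff(peak_times_s)  # inter-beat intervals in seconds
    if np.any(ibi_s <= 0):
        raise ValueError("peak times must be strictly increasing")
    return 60.0 / ibi_s  # beats per minute, one value per interval


# Illustrative peak times spaced 0.8 s apart (made-up values).
example_peaks_s = [0.0, 0.8, 1.6, 2.4]
example_hr_bpm = heart_rate_from_peaks(example_peaks_s)
```

In practice the per-interval values are usually smoothed (for example with a median over several beats) before being reported, since a single missed or spurious peak changes one interval substantially.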
This signal can be obtained with skin-contact optical sensors (as in wrist-worn wearables), via camera-based imaging (remote PPG/rPPG), or through hybrid approaches such as photoplethysmographic imaging (PPGI) in both reflectance and transmittance modes (Amelard et al., 2015, Protopopov, 4 Feb 2025).
2. Signal Acquisition Modalities
Contact Sensing
Reflective PPG sensors dominate consumer wearables such as wristbands and smartwatches. LEDs and photodiodes are co-located, and the sensor is typically placed in firm contact with the skin. Transmission-mode PPG, in which light passes through a peripheral site (e.g., finger), remains prevalent in clinical pulse oximeters but less practical for continuous ambulatory use.
Non-contact and Imaging PPG
Non-contact PPGI and remote PPG (rPPG) exploit standard RGB or monochrome cameras to extract pulsatile components from spatially-resolved image sequences (Amelard et al., 2015, Protopopov, 4 Feb 2025, Gudi et al., 2019, Pourbemany et al., 2021, Yang et al., 2021). Camera-based methods can function at distances ranging from centimeters to several meters and enable continuous multi-site measurement.
Multi-Site PPG Fusion
Distributed body sensor networks, leveraging multiple reflective PPG modules positioned at disparate anatomical sites (e.g., wrist, head, ankle, sternum), exploit the decorrelated artifact processes to reconstruct robust composite cardiac signals via dynamic quality-driven sensor fusion (Meier et al., 23 Dec 2024).
3. Algorithmic Approaches to Heart Rate Estimation
Classical and Signal Processing Approaches
Band-Pass Filtering and Peak Detection
Standard pipelines apply a band-pass filter (e.g., 0.5–5 Hz Butterworth) to the raw PPG, then detect peaks or valleys (often maxima in the AC component) to infer inter-beat intervals (Zhang et al., 2023, Tarniceriu et al., 2017, Amelard et al., 2015). Adaptive thresholding and local normalization are commonly employed to compensate for amplitude variation and baseline wander.
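A band-pass-then-peak-detect pipeline of this kind might look as follows with SciPy. This is a sketch under stated assumptions rather than the method of any cited paper: the filter order, the 240 BPM refractory limit, and the prominence heuristic (half the window's standard deviation) are illustrative choices, and `estimate_hr_bpm` is a hypothetical name.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks


def estimate_hr_bpm(ppg, fs, band_hz=(0.5, 5.0), order=2, max_bpm=240.0):
    """Estimate one HR value (BPM) from a single PPG window."""
    ppg = np.asarray(ppg, dtype=float)
    # Zero-phase Butterworth band-pass isolates the pulsatile (AC) component.
    sos = butter(order, band_hz, btype="bandpass", fs=fs, output="sos")
    ac = sosfiltfilt(sos, ppg)
    # Refractory distance: reject peaks closer than one beat at max_bpm.
    min_distance = max(1, int(fs * 60.0 / max_bpm))
    # Prominence threshold scaled to the window amplitude (assumed heuristic).
    peaks, _ = find_peaks(ac, distance=min_distance,
                          prominence=0.5 * np.std(ac))
    if peaks.size < 2:
        return float("nan")
    ibi_s = np.diff(peaks) / fs  # inter-beat intervals in seconds
    return 60.0 / float(np.median(ibi_s))


# Synthetic 1.25 Hz (75 BPM) pulse with baseline drift and mild noise.
fs = 50.0
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(0)
ppg = (np.sin(2 * np.pi * 1.25 * t)
       + 2.0 * np.sin(2 * np.pi * 0.1 * t)
       + 0.05 * rng.standard_normal(t.size))
hr_bpm = estimate_hr_bpm(ppg, fs)
```

Taking the median inter-beat interval makes the estimate tolerant of an occasional missed or extra peak; it does not, by itself, protect against motion artifacts that fall inside the pass band.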
Time–Frequency Analysis and Harmonic Models
Complex activity and motion introduce spectral overlap between cardiac and artifact subspaces, motivating advanced spectral approaches:
- Joint Sparse Spectrum Reconstruction (JOSS): Simultaneous modeling of PPG and accelerometer spectra using a multiple measurement vector (MMV) sparse recovery framework. The cardiac peak is isolated by suppressing spectral components shared between PPG and accelerometer channels, producing high-fidelity tracking under heavy motion (MAE ≈1.28 BPM) (Zhang, 2015).
- Harmonic Sum (HSUM) Model: Represents both heart-beat and motion artifact as sums of harmonics. Fundamental frequencies are grid-searched and fitted via least squares, yielding robust estimates even during intense activity, with reported MAE ≈0.74 BPM (Dubey et al., 2016).
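The grid-search-plus-least-squares idea behind harmonic models can be sketched in a simplified form. The sketch below fits only a single harmonic family (the cardiac one) and omits the separate motion-artifact term described above, so it is not the HSUM method itself; the function name, grid limits, and number of harmonics are assumptions chosen for illustration.

```python
import numpy as np


def harmonic_fit_f0(x, fs, f_grid_hz, n_harmonics=2):
    """Return the candidate fundamental (Hz) with the smallest LS residual.

    Simplified single-source sketch: no motion-artifact harmonics are modeled.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    t = np.arange(x.size) / fs
    k = np.arange(1, n_harmonics + 1)
    best_f0, best_rss = None, np.inf
    for f0 in f_grid_hz:
        # Design matrix with cosine and sine columns for each harmonic of f0.
        phase = 2.0 * np.pi * f0 * np.outer(t, k)
        basis = np.hstack([np.cos(phase), np.sin(phase)])
        coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
        rss = float(np.sum((x - basis @ coef) ** 2))
        if rss < best_rss:
            best_f0, best_rss = float(f0), rss
    return best_f0


# Synthetic window: 1.5 Hz fundamental (90 BPM) plus a weaker second harmonic.
fs = 25.0
t = np.arange(0, 8, 1 / fs)
rng = np.random.default_rng(1)
x = (np.sin(2 * np.pi * 1.5 * t)
     + 0.4 * np.sin(2 * np.pi * 3.0 * t + 0.3)
     + 0.1 * rng.standard_normal(t.size))
f_grid = np.arange(0.8, 3.5, 0.01)  # assumed search range, 48 to 210 BPM
f0_hat = harmonic_fit_f0(x, fs, f_grid)
hr_hat_bpm = 60.0 * f0_hat
```

A harmonic fit of this type is prone to octave errors: a candidate at half the true fundamental can explain the same spectral lines once enough harmonics are allowed, so the grid range and harmonic count need to be chosen together.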
Adaptive Filtering and Subspace Methods
Motion artifacts are often highly correlated with inertial measurements. Multi-stage adaptive noise cancellation using RLS or LMS filters, cascaded across the accelerometer axes, can substantially reduce artifact power in wrist PPG (Islam et al., 2017). Singular spectrum analysis (SSA) further decomposes the PPG into components and discards those whose dominant frequencies coincide with peaks in the accelerometer spectra.
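Accelerometer-referenced noise cancellation can be sketched with a normalized LMS (NLMS) filter for a single axis. NLMS is substituted here for plain LMS or RLS because its step size is easier to keep stable; the tap count, step size, and the name `nlms_cancel` are illustrative assumptions and do not reproduce the cited multi-stage design.

```python
import numpy as np


def nlms_cancel(ppg, ref, n_taps=8, mu=0.05, eps=1e-8):
    """Subtract the part of `ppg` that is linearly predictable from `ref`.

    ppg: primary input (cardiac signal plus motion artifact).
    ref: reference input (one accelerometer axis).
    Returns the error signal, which serves as the cleaned PPG.
    """
    ppg = np.asarray(ppg, dtype=float)
    ref = np.asarray(ref, dtype=float)
    w = np.zeros(n_taps)
    cleaned = np.zeros_like(ppg)
    for n in range(ppg.size):
        # Most recent reference samples, newest first, zero-padded at start.
        x = np.zeros(n_taps)
        k = min(n_taps, n + 1)
        x[:k] = ref[n - k + 1:n + 1][::-1]
        artifact_estimate = w @ x
        e = ppg[n] - artifact_estimate
        w += mu * e * x / (eps + x @ x)  # normalized LMS weight update
        cleaned[n] = e
    return cleaned


# Synthetic example: 1.2 Hz cardiac tone plus a 2.3 Hz motion artifact that is
# a scaled, phase-shifted copy of the accelerometer signal.
fs = 50.0
t = np.arange(0, 30, 1 / fs)
cardiac = np.sin(2 * np.pi * 1.2 * t)
accel = np.sin(2 * np.pi * 2.3 * t)
ppg = cardiac + 0.8 * np.sin(2 * np.pi * 2.3 * t - 0.6)
cleaned = nlms_cancel(ppg, accel)
```

For a three-axis accelerometer, one option is to run the function in cascade, passing each stage's output as the primary input of the next stage with a different axis as reference. The approach assumes the cardiac and motion frequencies are separated; when they overlap, the filter may also attenuate the cardiac component.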
Machine Learning and Deep Learning Techniques
Shallow and Model-Based ML
Feature-driven regressors—decision trees, random forests, multi-layer perceptrons—operate on compact time-domain summaries (e.g., a history of rough HR estimates) and require minimal computation, achieving sub-5% MAPE at 25 Hz sampling rates with models under 15 kB (Zhang et al., 2023).
Deep Neural Models: Device Agnostic and Adaptive
PPGNet applies a hybrid convolutional and recurrent network (CNN+LSTM) to 8 s PPG windows, achieving cross-validated MAE ≈3.36 BPM with transfer learning adaptation for device-specific calibration (A et al., 2019). Modern TCNs (temporal convolutional networks) with NAS-derived configurations, often fusing PPG and accelerometer data, deliver robust HR estimation (MAE 4–5 BPM) with resource footprints suitable for microcontroller deployment (Burrello et al., 2022, Burrello et al., 2022, Kasnesis et al., 2022). Explainable attention-based models (e.g., PULSE) deploy multi-head cross-attention to dynamically suppress motion artifacts, providing both accuracy and interpretability (Kasnesis et al., 2022).
Contact-Pressure Compensation
Deep generative approaches such as CP-PPG address suboptimal skin–sensor contact, learning to synthesize undistorted PPG morphology from pressure-distorted waveforms via adversarial autoencoding with custom morphology-aware loss. MAE reductions up to 40% (relative) in HR estimation have been demonstrated (Hung et al., 3 Apr 2025).
Accelerometer-Free Supervised Learning
Supervised NNs trained on hand-crafted PPG spectro-temporal features can achieve sub-2 BPM accuracy in intense exercise without reliance on inertial sensing. These approaches capitalize on robust peak morphology criteria to discriminate cardiac from artifact peaks (Essalat et al., 2020).
Remote and Imaging PPG Techniques
rPPG methods extract spatial–temporal pulse trains from facial color modulation using face tracking, ROI selection, chrominance–projection or ICA-based signal separation, and frequency/time-domain filtering (Pourbemany et al., 2021, Gudi et al., 2019, Yang et al., 2021). Current state-of-the-art pipelines combine skin-tone–invariant color projections (e.g., POS, CHROM) with motion suppression via spectral subtraction of head–motion cues, delivering HR and HRV estimates comparable to contact PPG in stable lighting (Gudi et al., 2019). Deep rPPG models (DeepPhys, rPPGNet, PhysNet) offer competitive performance in controlled conditions but are outperformed by classical chrominance-based methods under varying illumination unless augmented with extensive brightness variability during training (Yang et al., 2021).
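As an illustration of a chrominance-style projection, the following sketch follows the commonly published form of the plane-orthogonal-to-skin (POS) algorithm: temporal normalization per window, projection onto two fixed axes, amplitude-ratio tuning, and overlap-add. It assumes the per-frame mean RGB of a skin region is already available (face tracking and ROI selection are omitted), and the 1.6 s window, the function names, and the synthetic channel gains are assumptions for demonstration only.

```python
import numpy as np

# Fixed projection axes used by POS (rows sum to zero, so intensity changes
# common to all channels cancel after temporal normalization).
POS_PROJECTION = np.array([[0.0, 1.0, -1.0],
                           [-2.0, 1.0, 1.0]])


def pos_pulse(rgb, fps, window_s=1.6):
    """rgb: array (n_frames, 3) of mean R, G, B over a skin ROI per frame."""
    rgb = np.asarray(rgb, dtype=float)
    n = rgb.shape[0]
    win = int(round(window_s * fps))
    pulse = np.zeros(n)
    for start in range(0, n - win + 1):
        block = rgb[start:start + win]
        normed = block / block.mean(axis=0)   # temporal normalization
        s = normed @ POS_PROJECTION.T          # shape (win, 2)
        sigma2 = s[:, 1].std()
        alpha = s[:, 0].std() / sigma2 if sigma2 > 0 else 0.0
        h = s[:, 0] + alpha * s[:, 1]
        pulse[start:start + win] += h - h.mean()  # overlap-add
    return pulse


def dominant_frequency_hz(x, fs, band_hz=(0.7, 4.0)):
    """Frequency of the largest spectral peak inside a plausible HR band."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    mask = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    return float(freqs[mask][np.argmax(spectrum[mask])])


# Synthetic traces: 1.2 Hz pulse (72 BPM) with slow illumination drift.
fps = 30.0
t = np.arange(0, 20, 1 / fps)
pulse_true = np.sin(2 * np.pi * 1.2 * t)
illum = 1.0 + 0.05 * np.sin(2 * np.pi * 0.3 * t)
base = np.array([180.0, 120.0, 100.0])
gain = 0.01 * np.array([0.33, 0.77, 0.53])  # assumed per-channel pulsatility
rgb = base * illum[:, None] * (1.0 + pulse_true[:, None] * gain)
rppg = pos_pulse(rgb, fps)
hr_rppg_bpm = 60.0 * dominant_frequency_hz(rppg, fps)
```

The synthetic illumination term multiplies all three channels equally, which is the case the projection is designed to reject; real recordings also contain specular and motion components that this toy signal does not model.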
4. Motion Artifact and Error Detection
Motion artifacts remain a primary source of estimation error, especially in wrist-worn reflectance PPG under free-living or exercise conditions. Data fusion techniques, including joint accelerometry, multi-site redundancy, and attention-based neural fusion, have been shown to improve robustness (Kasnesis et al., 2022, Meier et al., 23 Dec 2024).
A recent advance is the deployment of real-time warning systems directly on wearable devices to detect and communicate PPG–HR inaccuracies without access to raw waveforms. A rolling 1D-CNN operating on 10-second HR windows predicts estimation error, and a user-facing interface communicates reliability via a color-coded indicator, achieving >80% detection of large (>20 BPM) errors across multiple commercial devices (Islmabouli et al., 27 Aug 2025).
5. Validation, Performance Metrics, and Device Deployment
Hardware implementations are increasingly constrained by energy, memory, and processing limits. Temporal convolutional networks generated via NAS and quantized to low-bit precision deliver efficient on-device inference, with an 8-bit integer (int8) network (MAE = 4.41 BPM) occupying 412 kB and consuming 47.65 mJ per inference on an STM32WB55 MCU (Burrello et al., 2022); a smaller network achieves MAE < 8 BPM at 1.9 kB and 1.79 mJ. Accuracy on benchmarks such as PPG-DaLiA and IEEE SPC2015 is typically characterized by mean absolute error (BPM), MAPE, standard deviation, and Pearson correlation, along with Bland–Altman analysis for clinical acceptability.
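The accuracy metrics named above can be computed in a few lines of NumPy. The sketch below uses the conventional definitions (Bland–Altman limits of agreement taken as bias ± 1.96 sample standard deviations of the differences); the function name and the example values are made up for illustration and are not results from any cited study.

```python
import numpy as np


def hr_agreement_metrics(estimated_bpm, reference_bpm):
    """Return common agreement metrics between estimated and reference HR."""
    est = np.asarray(estimated_bpm, dtype=float)
    ref = np.asarray(reference_bpm, dtype=float)
    if est.shape != ref.shape or est.size < 2:
        raise ValueError("inputs must have equal length of at least 2")
    diff = est - ref
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation of differences
    return {
        "mae_bpm": float(np.mean(np.abs(diff))),
        "mape_percent": float(100.0 * np.mean(np.abs(diff) / ref)),
        "pearson_r": float(np.corrcoef(est, ref)[0, 1]),
        "bias_bpm": bias,
        "loa_low_bpm": bias - 1.96 * sd,   # Bland-Altman lower limit
        "loa_high_bpm": bias + 1.96 * sd,  # Bland-Altman upper limit
    }


# Made-up example values, for illustration only.
estimated = [70.0, 82.0, 95.0, 110.0]
reference = [72.0, 80.0, 95.0, 105.0]
metrics = hr_agreement_metrics(estimated, reference)
```

MAE and MAPE summarize typical error magnitude, whereas the bias and limits of agreement show whether errors are systematic and how wide the spread is, which is why both views are usually reported together.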
Non-contact modalities, including PPGI and rPPG, report HR MAEs of ≈1–2 BPM in lab and realistic settings, though performance degrades under strong illumination variability or subject motion unless compensated by specialized preprocessing (e.g., ambient correction, spectral subtraction, or robust ROI tracking) (Amelard et al., 2015, Protopopov, 4 Feb 2025, Gudi et al., 2019, Yang et al., 2021).
6. Limitations and Future Directions
Persistent challenges include artifact resilience under strong motion, multi-illumination and multi-skin tone invariance, consistent performance across hardware variants, and energy-aware adaptation for continuous monitoring. Promising directions include:
- Real-time, interpretable reliability feedback to the user (Islmabouli et al., 27 Aug 2025);
- Multi-site PPG sensor fusion, exploiting segmentwise quality estimation for dynamic weighting (Meier et al., 23 Dec 2024);
- Deep learning pipelines with data augmentation to enhance robustness across lighting conditions (Yang et al., 2021);
- Contact-pressure–aware morphology restoration for improved waveform fidelity (Hung et al., 3 Apr 2025);
- Hardware-aware architecture search and aggressive quantization for on-wrist deployment (Burrello et al., 2022, Burrello et al., 2022).
A plausible implication is that longitudinal calibration and personalization, with adversarial or self-supervised domain adaptation, will be necessary to close the gap between laboratory-grade and real-world ambulatory PPG-based heart rate estimation. The field continues to balance the trade-offs between algorithmic complexity, interpretability, resource efficiency, and artifact robustness.