Virtual Sensor Models
- Virtual sensor models are mathematical constructs that estimate inaccessible physical variables using physics-based, machine learning, or hybrid techniques.
- They leverage state observers, neural networks, and Kalman filters to integrate sensor data for robust, real-time decision-making and diagnostics.
- These models find practical applications in automotive control, industrial monitoring, robotics, and IoT, enhancing efficiency and reducing measurement costs.
A virtual sensor is a mathematical or algorithmic construct that infers, estimates, or synthesizes the value of a physical quantity or system parameter that cannot be directly measured, or that is impractical to measure in real time or at acceptable cost. Virtual sensor models are deployed across diverse application domains—including automotive control, industrial process monitoring, robotics, structural health monitoring, and IoT—by leveraging physics-based, data-driven, or hybrid methodologies. These models integrate real sensor measurements, system knowledge, and statistical or learning algorithms to generate reliable proxies for otherwise unavailable variables, often supporting real-time decision-making, control, or diagnostics.
1. Fundamental Principles and Classifications
Virtual sensors are broadly classified by their modeling paradigm and system integration approach.
- Physics-based models: Employ physical laws and mechanistic models (e.g., tire models in vehicles, observer-based state estimation). Adaptive model-based observers and parameter identification are typical, such as using adaptive Magic Formula tire models combined with model-based observers for sideslip estimation (Singh, 2018).
- Data-driven models: Leverage statistical learning, artificial neural networks (ANNs), support vector machines, or other machine learning techniques to directly map available sensor readings to the estimated quantity. Architecture examples include feedforward NNs for engine parameter estimation and end-to-end deep learning models for sensor signal emulation (Rastogi et al., 2017, Elmadawi et al., 2019).
- Hybrid approaches: Combine physical-model-based state observers with statistical or ML post-processing, e.g., banks of linear observers feeding residuals to an ANN or random forest regressor, synthesizing an estimate of an unmeasurable scheduling parameter (Masti et al., 2021, Previtali et al., 6 Aug 2025).
Key types of virtual sensors include:
- State estimators: Infer unmeasurable dynamic states (e.g., vehicle sideslip, battery SoC, robot tip position).
- Soft sensors: Emulate difficult-to-access or expensive sensor readings, commonly for control or monitoring (e.g., process setpoints, environmental parameters).
- Sensor fusion virtual sensors: Compute optimal estimates (e.g., via generalized least squares or Kalman filtering) from physically redundant arrays, taking explicit account of cross-sensor dependencies (Xu et al., 2019).
- Virtual fusion models: Train single-sensor representations to serve as surrogates for multi-sensor fusion via contrastive learning or other alignment objectives (Nguyen et al., 2023).
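For the simplest redundant-array case, the fusion idea behind sensor-fusion virtual sensors can be sketched as inverse-variance weighting, the diagonal-covariance special case of generalized least squares. The function name and numbers below are illustrative, not taken from the cited work, and the sketch assumes independent sensor noise:

```python
# Minimal sketch of a sensor-fusion virtual sensor: redundant sensors measuring
# the same quantity are combined by inverse-variance weighting (diagonal-
# covariance generalized least squares). Assumes independent sensor noise.

def fuse_redundant(readings, variances):
    """Minimum-variance estimate from redundant sensors of one quantity.

    readings  -- list of sensor values
    variances -- list of corresponding noise variances (must be > 0)
    Returns (fused_estimate, fused_variance).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, readings)) / total
    return estimate, 1.0 / total

# Three redundant temperature sensors, the middle one noisier than the others.
est, var = fuse_redundant([20.1, 21.0, 19.9], [0.04, 0.25, 0.04])
```

Note that the fused variance is always smaller than the best individual sensor's variance, which is the practical payoff of redundancy; correlated sensors require the full covariance-matrix inversion mentioned above.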
2. Mathematical Foundations and Architectures
Virtual sensor models are instantiated via a variety of mathematical structures, typically optimized during a design or identification phase using a dataset in which all relevant inputs and (temporarily available) targets are measured.
- Feedforward neural networks (NN): For regression tasks, NNs take as inputs a vector of measured signals and output the virtual sensor value. Hyperparameters (hidden-layer size, activation functions, regularization) are chosen by grid search and validation on held-out data, with weights tuned via training algorithms such as Levenberg–Marquardt or Bayesian regularization (Rastogi et al., 2017).
- Partial Least Squares (PLS) regression: Dimensionality reduction and regression are jointly achieved by maximizing the covariance between latent projections of high-dimensional process sensor data (X) and outputs (Y). Linear PLS, polynomial PLS, and NN-PLS (neural-network approximation of PLS components) support flexible modeling in semiconductor process monitoring (0706.0465).
- Bank of state observers: For parameter-/mode-varying systems, local ARX (AutoRegressive with eXogenous input) models are fit for representative operating points; each is converted to a state observer (typically in observer-canonical form), and the bank is executed in parallel. The innovation (residual) features extracted from these observers are then mapped to the virtual variable by a regressor (e.g., shallow ANN or random forest). This combines robust filtering with discriminative prediction and has demonstrated success in parameter-varying system state/parameter estimation, battery SoC, and mode detection (Masti et al., 2021, Previtali et al., 6 Aug 2025).
- Extended Kalman Filtering (EKF), with or without virtual sensor augmentation: Virtual sensor outputs are integrated as pseudo-measurements alongside classical observables, with noise covariance matrices calibrated (often via black-box optimization), ensuring robust denoising and filtering of the estimate (Previtali et al., 6 Aug 2025).
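The pseudo-measurement idea in the last bullet can be illustrated with a scalar linear Kalman filter that sequentially updates with a physical reading and a virtual-sensor reading each step. The random-walk dynamics and all noise values below are assumptions made for the sketch, not parameters from the cited work:

```python
# Sketch of a scalar Kalman filter treating a virtual-sensor output as a
# pseudo-measurement alongside a physical sensor. Illustrative model: a
# random-walk state observed by both channels; q/r values are assumptions.

def kf_step(x, p, z_phys, z_virt, q=0.01, r_phys=0.5, r_virt=0.2):
    # Predict: random-walk dynamics (state unchanged, uncertainty grows).
    p = p + q
    # Sequentially update with each measurement channel.
    for z, r in ((z_phys, r_phys), (z_virt, r_virt)):
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # state update toward the measurement
        p = (1.0 - k) * p        # covariance shrinks after each update
    return x, p

# Run a few steps; the true value is ~1.0 in both channels.
x, p = 0.0, 1.0
for z_phys, z_virt in [(1.1, 0.9), (1.0, 1.05), (0.95, 1.0)]:
    x, p = kf_step(x, p, z_phys, z_virt)
```

In the EKF setting described above, the virtual-sensor noise variance (`r_virt` here) is exactly the quantity that is typically calibrated via black-box optimization.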
A summary table of core architectures:
| Model type | Core methodology | Example domains |
|---|---|---|
| NN-based | Feedforward/MLP, CNN, GNN | Engine sensors, HAR, LiDAR |
| Model-based | Observer, Kalman filter | Sideslip, SoC, robot tip pos. |
| Hybrid | Observer bank + ML | Parameter-varying, SoC, FDC |
| PLS-based | Linear/poly/NN PLS | Semiconductor FDC |
3. Identification, Training, and Calibration
The identification or training phase exploits time-synchronized measurements of both the desired virtual variable(s) and the candidate input variables under sufficiently rich excitation.
- Dataset construction: Instrumentation of the system with both low-cost (operational) sensors and high-cost sensors used only during modeling. Input/target pairs are extracted, often using windowed features or full temporal context (Tsuji et al., 29 May 2025).
- Data-driven feature extraction: For cases with strongly nonlinear dependencies or mode-dependent dynamics, extracted features can include observer residuals, spectral moments, or concatenated raw measurements over a window (Masti et al., 2021).
- Learning/fitting: Model parameters are tuned to minimize task-specific metrics—typically mean squared error (MSE), mean absolute error (MAE), or classification loss—with training terminated by early stopping or based on validation set performance. Regularization, dropout, and grid/random search for hyperparameters are widely adopted (Rastogi et al., 2017, Previtali et al., 6 Aug 2025).
- Parameter adaptation: In application to changing environments, recursive schemes (e.g., online gradient adaptation of tire model parameters) can be used to maintain model tracking (Singh, 2018).
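As a minimal illustration of the fitting-with-early-stopping step, the sketch below trains a one-input linear virtual-sensor map by full-batch gradient descent on MSE and stops once validation error stalls. The data, learning rate, and patience are invented for the example:

```python
# Sketch of the identification loop: fit a linear virtual-sensor map by
# gradient descent on training MSE, stopping when held-out validation error
# stops improving. Data and hyperparameters are illustrative assumptions.

def fit_linear(train, val, lr=0.05, patience=20, max_epochs=2000):
    w, b = 0.0, 0.0
    best = (float("inf"), w, b)   # (val MSE, w, b) of the best model so far
    stale = 0
    for _ in range(max_epochs):
        # One full-batch gradient step on training MSE.
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        gb = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w, b = w - lr * gw, b - lr * gb
        # Early stopping on validation MSE.
        v = sum((w * x + b - y) ** 2 for x, y in val) / len(val)
        if v < best[0] - 1e-9:
            best, stale = (v, w, b), 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best[1], best[2]

# Target relation y ≈ 2x + 1 with mild noise.
train = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.9)]
val = [(0.5, 2.0), (2.5, 6.0)]
w, b = fit_linear(train, val)
```

The same skeleton generalizes directly to NN weights: only the gradient computation changes, while the validation-based stopping rule is identical to the one cited above.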
Calibration against ground truth and adjustment of normalization or units are essential for comparability and operational integration. For observer-based fusion, optimal weighting (e.g., via covariance inversion) ensures minimum-variance estimation (Xu et al., 2019).
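A minimal form of such calibration is a closed-form gain/offset fit of raw virtual-sensor outputs against temporarily available ground truth; the function name and values below are illustrative:

```python
# Sketch of post-hoc calibration: fit the gain and offset that map raw
# virtual-sensor outputs onto ground-truth units via closed-form least
# squares. Numbers are illustrative.

def calibrate(raw, truth):
    n = len(raw)
    mx = sum(raw) / n
    my = sum(truth) / n
    sxx = sum((x - mx) ** 2 for x in raw)
    sxy = sum((x - mx) * (y - my) for x, y in zip(raw, truth))
    gain = sxy / sxx                # least-squares slope
    offset = my - gain * mx         # intercept through the means
    return gain, offset

# Virtual sensor reads roughly half the true value plus a constant bias.
gain, offset = calibrate([1.0, 2.0, 3.0, 4.0], [3.0, 5.1, 6.9, 9.0])
```

Applying `gain * raw + offset` then yields outputs in ground-truth units, after which the sensor can be integrated operationally.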
4. Performance Evaluation and Benchmarking
Performance metrics vary according to end-use:
- Regression (continuous value) tasks: MSE, RMSE, R², and the percentage of test points falling within ± error bounds (e.g., ≥99% accuracy on all test samples) are standard (Rastogi et al., 2017, 0706.0465).
- Classification: F1-score, accuracy, and per-class metrics (e.g., for human activity recognition) are used; virtual fusion models can sometimes surpass physical sensor-fusion baselines (Nguyen et al., 2023).
- Real-time requirements: Model evaluation must fit hardware constraints: inference latency below 1 ms and memory footprints on the order of 100 kB for embedded deployment are routinely demonstrated (Masti et al., 2021, Previtali et al., 6 Aug 2025).
- Experimental design: Validation on held-out or temporally separated datasets, including adverse or unseen conditions, is used to verify generalizability. In process control, “outlier detection” logic compares virtual sensor outputs to recipe setpoints or predicted physical-property bounds to enable fault detection (0706.0465).
- Physical realism in synthetic models: For virtual sensors that simulate physical sensor signatures (e.g., LiDAR, tactile sensors), fidelity is assessed by quantitative comparison to real sensor distributions, e.g., per-point MSE/RMSE, distributional KL divergence, mode overlap, or geometric reconstruction accuracy (e.g., sub-millimeter RMSE in 3D reconstruction) (Elmadawi et al., 2019, Haroon et al., 18 Sep 2025, Leins et al., 14 Jan 2025).
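For concreteness, the standard regression metrics above (plus MAPE, used in the bearing-load example later) can be computed as follows; the sample values are invented:

```python
# Pure-Python computation of the regression metrics named above:
# RMSE, MAPE, and R². Sample values are illustrative.
import math

def metrics(y_true, y_pred):
    n = len(y_true)
    errs = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errs) / n
    rmse = math.sqrt(mse)
    # MAPE assumes no zero targets.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errs, y_true)) / n
    mean = sum(y_true) / n
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot   # 1 - SS_res / SS_tot
    return rmse, mape, r2

rmse, mape, r2 = metrics([10.0, 12.0, 14.0, 16.0], [10.2, 11.8, 14.3, 15.9])
```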
5. Applications and Impact
Virtual sensor models are integral to a wide range of application sectors:
- Automotive: Sideslip angle estimation, oil pressure emulation, and fault detection/setpoint verification (Chassis/ECM) (Singh, 2018, Rastogi et al., 2017, 0706.0465).
- Industrial process control: Soft sensors for wafer-state, recipe setpoints, process FDC in plasma etching reactors as critical enablers for yield and quality (0706.0465).
- Robotics and automation: High-cost sensing emulation (e.g., tip position, tactile proxies), either for real-time NMPC or in simulation-to-real transfer (Tsuji et al., 29 May 2025, Leins et al., 14 Jan 2025).
- Battery management: Nonlinear or operating-point-dependent SoC estimation, with hybrid observer–ML fusion providing best tradeoffs of accuracy and smoothness compared to either filter-only or pure learning methods (Previtali et al., 6 Aug 2025, Masti et al., 2021).
- Structural/prognostic health: Virtual sensors for load and stress, leveraging spatial–temporal modeling of distributed, heterogeneous signals (e.g., HTGNN for bearing load prediction) (Zhao et al., 2024).
- Human activity recognition: Virtual fusion models enable single-sensor systems to benefit from the effective information content of broader sensor suites, yielding performance commensurate with or exceeding classical fusion (Nguyen et al., 2023); virtual data generation (AgentSense) simulates labeled sensor streams for robust HAR model training (Leng et al., 13 Jun 2025).
6. Advanced Topics and Future Directions
Major trends in current research and prospective extensions include:
- Generalization and digital twin workflows: Physics-based rendering and sensor simulation frameworks (e.g., VIRTUS-FPP in Isaac Sim) enable accurate digital-twin workflows, supporting rapid prototyping, calibration, and scenario benchmarking with validation against real data (Haroon et al., 18 Sep 2025).
- Heterogeneous, spatio-temporal learning: Graph-based models (HTGNN) explicitly model sensor network topology and heterogeneous signals, outperforming homogeneous deep learning baselines in complex industrial systems (Zhao et al., 2024).
- Adversarial settings and side-channel synthesis: Virtual sensors are now considered for security-critical verification (e.g., synthesizing virtual IMUs from camera side channels to fortify device authentication) (Long et al., 2023).
- Integration in real-time feedback control: Robust, high-throughput implementations (optimized observer–ML loops, parallelized Monte-Carlo) close the loop between measurement and actuator in applications from plasma IED control to battery management (Bogdanova et al., 2020, Previtali et al., 6 Aug 2025).
- Middleware and cloud/fog integration: The formalization of Virtual Sensor constructs as first-class entities (e.g., in fog-cloud IoT platforms) supports scalable, reconfigurable, and fault-tolerant distributed dataflows, with well-defined aggregation, imputation, and monitoring logic (AlMahamid et al., 2022).
- Limitations: Open challenges include model adaptation under abrupt operational changes (e.g., rapid friction shift on icy roads), computational costs for real-time complex model evaluation, generalization to new operational regimes, and the necessity of substantial labeled data for robust ML-driven virtual sensors. Simulation models often require careful calibration against hardware to avoid systematic bias (Singh, 2018, Haroon et al., 18 Sep 2025, Leins et al., 14 Jan 2025).
7. Representative Example Implementations
A selection of prominent virtual sensor model instantiations:
- Feedforward NN for diesel-engine oil pressure: MSE <1 kPa; 100% of test points within the 99% accuracy band; Bayesian regularization and AWB coefficient search further reduce the prediction range by up to 23% (Rastogi et al., 2017).
- Hybrid observer–ML virtual sensors for SoC estimation: The fused virtual sensor + EKF achieves both the lowest error and the smoothest estimate (SOC-RMSE ≈ 0.019, TV ≈ 0.9×10⁻³), outperforming both the baseline EKF and the standalone ML virtual sensor (Previtali et al., 6 Aug 2025).
- HTGNN for bearing load: Mean absolute percentage error (MAPE) of 4.5% (axial) and 5.7% (radial) on seen conditions, roughly half that of a CNN baseline, with robust generalization to unseen operational scenarios (Zhao et al., 2024).
- Digital-twin fringe projection sensor: Absolute radial error of 0.512 mm (1.02%) in simulated reconstruction versus ground truth; cloud-to-mesh distances for real and virtual twins peak in 0–1 mm bin (Haroon et al., 18 Sep 2025).
Virtual sensor models now permeate industrial, automotive, robotics, process, and ambient intelligence domains, providing a rigorous, efficient, and resource-optimal approach to the estimation of critical but unmeasured quantities, leveraging both mechanistic insight and data-driven learning (Singh, 2018, Rastogi et al., 2017, Tsuji et al., 29 May 2025, Xu et al., 2019, Nguyen et al., 2023, Previtali et al., 6 Aug 2025, Zhao et al., 2024, Leins et al., 14 Jan 2025, Haroon et al., 18 Sep 2025, AlMahamid et al., 2022, Elmadawi et al., 2019, 0706.0465, Masti et al., 2021, Bogdanova et al., 2020, Leng et al., 13 Jun 2025, Long et al., 2023).