
SHIELD: Securing Healthcare IoT with Efficient Machine Learning Techniques for Anomaly Detection

Published 5 Nov 2025 in cs.LG | (2511.03661v1)

Abstract: The integration of IoT devices in healthcare introduces significant security and reliability challenges, increasing susceptibility to cyber threats and operational anomalies. This study proposes a machine learning-driven framework for (1) detecting malicious cyberattacks and (2) identifying faulty device anomalies, leveraging a dataset of 200,000 records. Eight machine learning models are evaluated across three learning approaches: supervised learning (XGBoost, K-Nearest Neighbors (K-NN)), semi-supervised learning (Generative Adversarial Networks (GAN), Variational Autoencoders (VAE)), and unsupervised learning (One-Class Support Vector Machine (SVM), Isolation Forest, Graph Neural Networks (GNN), and Long Short-Term Memory (LSTM) Autoencoders). The comprehensive evaluation was conducted across multiple metrics, including F1-score, precision, recall, accuracy, ROC-AUC, and computational efficiency. XGBoost achieved 99% accuracy with minimal computational overhead (0.04s) for anomaly detection, while Isolation Forest balanced precision and recall effectively. LSTM Autoencoders underperformed with lower accuracy and higher latency. For attack detection, KNN achieved near-perfect precision, recall, and F1-score with the lowest computational cost (0.05s), followed by VAE at 97% accuracy. GAN showed the highest computational cost with the lowest accuracy and ROC-AUC. These findings enhance IoT-enabled healthcare security through effective anomaly detection strategies. By improving early detection of cyber threats and device failures, this framework has the potential to prevent data breaches, minimize system downtime, and ensure the continuous and safe operation of medical devices, ultimately safeguarding patient health and trust in IoT-driven healthcare solutions.

Summary

  • The paper introduces the SHIELD framework that integrates efficient ML models (notably XGBoost and KNN) for healthcare IoT anomaly detection, achieving accuracies up to 99%.
  • The paper benchmarks eight models across supervised, semi-supervised, and unsupervised paradigms, highlighting that supervised approaches offer superior performance and computational efficiency.
  • The paper details rigorous feature engineering and selection strategies to ensure precise detection of both device malfunctions and cyberattacks in resource-constrained healthcare settings.

SHIELD: Efficient Machine Learning for Anomaly Detection in Healthcare IoT

Introduction

The proliferation of IoT devices in healthcare environments has introduced substantial security and reliability challenges, necessitating robust anomaly detection mechanisms to safeguard patient safety and system integrity. The SHIELD framework addresses these challenges by integrating efficient machine learning techniques for the detection of both cyberattacks and faulty device anomalies in healthcare IoT systems. The framework leverages a large-scale dataset (200,000 records) from an ICU setting, encompassing both device-level operational data and network-level attack data, and systematically evaluates eight machine learning models across supervised, semi-supervised, and unsupervised paradigms.

Methodology

Data Acquisition and Preprocessing

The dataset comprises two primary categories: faulty device data and attack data. Device data includes physiological measurements (temperature, blood pressure, heart rate), battery level, and control parameters, while attack data captures network traffic metadata, TCP/MQTT protocol flags, and temporal features. Preprocessing involves median imputation for missing values, one-hot encoding for categorical variables, and model-appropriate scaling (StandardScaler for the traditional ML models, min-max scaling for the deep learning models).
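The preprocessing steps named above can be illustrated with a minimal standard-library sketch. The column names and values are hypothetical, and the paper's actual pipeline (which refers to StandardScaler) is not reproduced here; this only shows what each step does to a single column.

```python
from statistics import mean, median, pstdev

def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]

def standard_scale(column):
    """Zero-mean, unit-variance scaling using the population standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

def min_max_scale(column):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def one_hot(column):
    """Return the sorted category list and one indicator row per value."""
    categories = sorted(set(column))
    rows = [[1 if v == c else 0 for c in categories] for v in column]
    return categories, rows

# Hypothetical sample columns (not taken from the paper's dataset).
heart_rate = [72, None, 80, 120, 76]
protocol = ["tcp", "mqtt", "tcp", "tcp", "mqtt"]

hr_filled = impute_median(heart_rate)      # the gap is filled with the median
hr_standard = standard_scale(hr_filled)    # input for the traditional ML models
hr_minmax = min_max_scale(hr_filled)       # input for the deep learning models
categories, protocol_encoded = one_hot(protocol)
```

In practice, the imputation value and scaling statistics would be computed on the training split only and then reused on the test split, so that no information leaks from test data into training.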

Feature engineering introduces domain-specific metrics such as Heart Rate Deviation (HRD), Blood Pressure Deviation (BPD), and TCP Anomaly Score, enhancing the model's ability to detect subtle anomalies. Feature selection is performed using ANOVA F-value, Mutual Information, and Recursive Feature Elimination, resulting in an optimal subset of features for each detection task.
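As a rough illustration of why a deviation feature can help, the sketch below derives a Heart Rate Deviation column and scores it with a one-way ANOVA F-value. The definition of HRD used here (absolute distance from a baseline, defaulting to the median) is an assumption made for illustration; the summary does not give the paper's formula, and the readings and labels are invented.

```python
from statistics import mean, median

def deviation_feature(values, baseline=None):
    """Absolute deviation from a baseline (ASSUMED definition of HRD/BPD)."""
    if baseline is None:
        baseline = median(values)
    return [abs(v - baseline) for v in values]

def anova_f(values, labels):
    """One-way ANOVA F-value of a single feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical readings: label 1 marks a faulty device, which may read
# either far too high or far too low.
heart_rate = [70, 72, 71, 69, 73, 110, 40, 115]
battery = [80, 55, 90, 60, 75, 85, 58, 70]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

hrd = deviation_feature(heart_rate)
scores = {
    "heart_rate": anova_f(heart_rate, labels),
    "hrd": anova_f(hrd, labels),
    "battery": anova_f(battery, labels),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because faulty readings in this toy data err in both directions, the raw heart rate separates the classes poorly, while the deviation feature ranks first. Whether the same holds for the paper's data is not something this sketch can show.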

Model Architectures and Training

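Eight models are implemented, grouped by learning paradigm:

  • Supervised: XGBoost and K-Nearest Neighbors (KNN).
  • Semi-supervised: Generative Adversarial Network (GAN) and Variational Autoencoder (VAE).
  • Unsupervised: One-Class SVM, Isolation Forest, Graph Neural Network (GNN), and LSTM Autoencoder.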

Each model is tuned with task-specific hyperparameters. Supervised models utilize stratified train-test splits or full labeled datasets, while semi-supervised and unsupervised models rely on reconstruction errors or decision boundaries for anomaly detection. Evaluating all three paradigms on the same data makes it less likely that the conclusions reflect the quirks of a single model family.
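For the reconstruction-based models, one common way to turn errors into decisions is to fit a threshold on errors measured over normal data and flag anything above it. The sketch below assumes a 95th-percentile cutoff and invented error values; the paper's actual thresholding rule is not stated in this summary.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def fit_threshold(normal_errors, q=95):
    """Pick the cutoff from errors on known-normal records (q is an assumption)."""
    return percentile(normal_errors, q)

def flag_anomalies(errors, threshold):
    """A record is anomalous when its reconstruction error exceeds the cutoff."""
    return [e > threshold for e in errors]

# Hypothetical reconstruction errors from an autoencoder on normal traffic.
normal_errors = [0.02, 0.03, 0.01, 0.04, 0.02, 0.05, 0.03, 0.02, 0.04, 0.03]
threshold = fit_threshold(normal_errors, q=95)

# Errors for four new records; the second and fourth reconstruct poorly.
flags = flag_anomalies([0.03, 0.40, 0.05, 0.09], threshold)
```

The percentile trades recall against false alarms: a lower cutoff catches more anomalies but flags more normal records.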

Experimental Results

Faulty Device Anomaly Detection

  • XGBoost achieves 99% accuracy, perfect precision and recall, and minimal computational cost (0.04s), outperforming all other models.
  • KNN also demonstrates high accuracy and efficiency, making it suitable for real-time deployment.
  • Isolation Forest and GAN provide near-perfect recall and ROC-AUC, but GAN incurs significantly higher computational cost.
  • VAE offers balanced performance (97% accuracy), while One-Class SVM is lightweight but less precise.
  • LSTM Autoencoder and GNN underperform in accuracy and computational efficiency, limiting their applicability for time-sensitive tasks.
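When reading results such as those above, it helps to recall how accuracy, precision, recall, and F1-score derive from the same confusion-matrix counts. The sketch below uses invented labels, not the paper's data, to show that accuracy alone can look strong on imbalanced data even when no fault is detected.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical imbalanced sample: 2 faulty records out of 100.
y_true = [1, 1] + [0] * 98

# A model that never raises an alert still scores 98% accuracy.
always_normal = classification_metrics(y_true, [0] * 100)

# A model that catches one of the two faults.
catches_one = classification_metrics(y_true, [1, 0] + [0] * 98)
```

This is why the comparison reports precision, recall, and F1-score alongside accuracy.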

Healthcare Cyberattack Detection

  • KNN achieves near-perfect precision, recall, and F1-score (99%) with the lowest computational cost (0.05s).
  • XGBoost closely follows, excelling in all metrics.
  • VAE and Isolation Forest are competitive, with VAE reaching 97% accuracy.
  • GAN exhibits the lowest accuracy (83%) and ROC-AUC (0.72), indicating limitations in adversarial training for this domain.
  • LSTM Autoencoder and GNN show high ROC-AUC but suffer from low precision/recall and high latency, making them impractical for real-time detection.
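A high ROC-AUC combined with low precision or recall is not contradictory: ROC-AUC measures how well scores rank anomalies above normal records, while precision and recall depend on where the decision threshold sits. The toy example below shows one way this can happen. It is an illustration with invented scores, not an explanation confirmed by the paper.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores at or above the threshold are flagged."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical anomaly scores: attacks rank highest, but all scores are low.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.40, 0.45, 0.48]

auc = roc_auc(y_true, scores)                              # ranking is perfect
too_strict = precision_recall_at(y_true, scores, 0.50)     # nothing is flagged
too_loose = precision_recall_at(y_true, scores, 0.25)      # false alarms appear
```

Here the ranking is perfect, yet a threshold of 0.50 detects nothing and a threshold of 0.25 halves precision, so threshold calibration matters as much as the ranking quality.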

Comparative Analysis

Supervised models consistently outperform semi-supervised and unsupervised approaches in both detection tasks, particularly when labeled data is available and anomaly patterns are well-defined. The computational efficiency of XGBoost and KNN enables real-time anomaly detection on resource-constrained edge devices, while the modularity of SHIELD supports integration with cloud-based security solutions.
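To make the efficiency comparison concrete, the sketch below implements a bare-bones k-nearest-neighbors classifier and times its predictions. The two features, the training points, and k=3 are hypothetical; the paper's KNN configuration and its timing methodology are not described in this summary, so measured times here are not comparable to the reported 0.05s.

```python
import math
import time
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical scaled features: (packet_rate, syn_flag_ratio); 1 = attack.
train_x = [(0.10, 0.10), (0.20, 0.10), (0.15, 0.20),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.85)]
train_y = [0, 0, 0, 1, 1, 1]
queries = [(0.12, 0.15), (0.88, 0.90)]

start = time.perf_counter()
predictions = [knn_predict(train_x, train_y, q) for q in queries]
elapsed = time.perf_counter() - start   # wall-clock inference time in seconds
```

KNN has no training phase, but each prediction scans the stored training set, so inference cost grows with the amount of labeled data kept on the device.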

Implications and Future Directions

The SHIELD results indicate that, when labeled data is available, supervised learning is the strongest of the evaluated approaches for mission-critical healthcare IoT anomaly detection, combining the highest accuracy with the lowest computational cost. The scalability of SHIELD allows deployment across diverse healthcare environments, from small clinics to large hospital networks. The framework's modular design facilitates integration with federated learning for privacy-preserving collaborative training and supports future extensions with time-series models and reinforcement learning for adaptive threat detection.

The results suggest that adversarial training (GAN) is less effective for healthcare IoT anomaly detection, likely due to the complexity and heterogeneity of medical device data. Conversely, tree-based and instance-based supervised models (XGBoost, KNN) are highly reliable, especially when feature selection is rigorously applied.

Future research should focus on enhancing SHIELD with streaming analytics for low-latency decision-making, expanding its applicability to high-throughput healthcare networks, and exploring hybrid models that combine the strengths of supervised and unsupervised paradigms. The integration of explainable AI techniques will further improve trust and transparency in clinical settings.

Conclusion

SHIELD provides a comprehensive, scalable, and computationally efficient solution for anomaly detection in healthcare IoT systems. Supervised learning models, particularly XGBoost and KNN, deliver the best combination of accuracy and speed, making them ideal for real-time deployment in mission-critical environments. The framework's adaptability and modularity position it as a robust foundation for securing next-generation smart healthcare infrastructure, with significant potential for future enhancements in privacy, scalability, and adaptability.
