- The paper introduces a Tsetlin Machine-based IDS that leverages rule-based logic for highly interpretable IoMT security.
- The methodology employs binarized feature vectors and SMOTE balancing, achieving 99.5% accuracy in binary and 90.7% in multi-class scenarios.
- The system delivers real-time performance with microsecond-scale inference (0.743 µs in the binary scenario, under 5 µs in the six-class scenario) and transparent decision pathways suited to resource-constrained medical devices.
Tsetlin Machine-driven Intrusion Detection for IoMT Security
Introduction to IoMT Security and IDS Challenges
The widespread integration of the Internet of Medical Things (IoMT) is significantly boosting healthcare efficiency, connectivity, and real-time medical data exchange. However, this digitalization amplifies cybersecurity risks, as IoMT devices process sensitive patient information and are particularly susceptible to network-based threats. Conventional rule-based and signature-driven IDS approaches in IoMT are largely static and struggle to generalize to novel or evolving attack vectors, exhibiting limited capacity for interpretability, scalability, and timely operation.
Machine learning-driven IDS solutions increasingly address these gaps, but prevailing techniques (including DNNs, LSTMs, and classical ensemble classifiers) generally offer little explicit logical reasoning or transparency. The black-box nature of deep models poses obstacles for regulatory compliance and user trust, especially in safety-critical healthcare contexts. This paper proposes a Tsetlin Machine (TM)-based IDS that focuses on interpretable, efficient, and logically grounded pattern recognition, specifically tailored for heterogeneous and resource-constrained IoMT environments.
Figure 1: A schematic demonstration of IoMT workflows in connected healthcare.
Tsetlin Machine Architecture for Interpretable IDS
Built on Tsetlin Automata, the TM is a rule-based learning architecture that learns sets of propositional logic clauses (conjunctions of literals) over binarized feature vectors. For multi-class discrimination, each attack class is represented by multiple clauses with positive and negative polarities, whose outputs are aggregated into a class-specific vote score. Clause selection and specialization are modulated by the threshold T and the specificity s, supporting compact and fine-grained logic formation. This discrete, interpretable formulation is well matched to low-power medical devices and provides transparent decision-making pathways, in contrast to black-box ML models.
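The voting scheme described above can be illustrated with a minimal sketch of TM-style inference. It is not the paper's implementation: the clause encoding, the toy model, and the three feature bits are hypothetical, and the clauses are written by hand rather than learned by Tsetlin Automata.

```python
from typing import Dict, List, Tuple

# A clause is a conjunction of literals. Each literal is (feature_index, expected_bit):
# expected_bit=1 stands for "x[i]", expected_bit=0 stands for "NOT x[i]".
Clause = List[Tuple[int, int]]
ClassClauses = Tuple[List[Clause], List[Clause]]  # (positive polarity, negative polarity)


def clause_fires(clause: Clause, x: List[int]) -> bool:
    """A clause outputs 1 only if every literal it includes is satisfied."""
    return all(x[i] == bit for i, bit in clause)


def class_score(pos: List[Clause], neg: List[Clause], x: List[int], T: int) -> int:
    """Positive clauses vote for the class, negative clauses vote against it.

    The sum is clamped to [-T, T]. In the usual TM formulation this clamp mainly
    regulates feedback during training; it is applied here for illustration only.
    """
    votes = sum(clause_fires(c, x) for c in pos) - sum(clause_fires(c, x) for c in neg)
    return max(-T, min(T, votes))


def predict(model: Dict[str, ClassClauses], x: List[int], T: int):
    """Return the class with the highest vote score, plus all scores."""
    scores = {label: class_score(p, n, x, T) for label, (p, n) in model.items()}
    return max(scores, key=scores.get), scores


# Hypothetical bits: x[0] = "high packet rate", x[1] = "small payload", x[2] = "known port".
model: Dict[str, ClassClauses] = {
    "Benign": ([[(0, 0)], [(2, 1)]], [[(0, 1), (1, 1)]]),
    "DoS": ([[(0, 1)], [(0, 1), (1, 1)]], [[(2, 1), (0, 0)]]),
}

label, scores = predict(model, [1, 1, 0], T=10)
print(label, scores)
```

For the sample `[1, 1, 0]`, both positive DoS clauses fire and one negative Benign clause fires, so the scores are `{'Benign': -1, 'DoS': 2}` and the prediction is `DoS`.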
Figure 2: Overall system architecture of the TM-driven IDS, detailing data preprocessing, clause learning, explainable inference, and deployment loops.
Preprocessing steps encompass data cleaning, missing value handling, SMOTE-assisted class balancing for training, and conversion of continuous features to binned binary representations, optimizing input for logical pattern extraction. At inference, TM outputs dominant class votes and exposes active clause sets, directly linking predictions to logic fragments derived from training data.
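As a rough illustration of the binarization step, the sketch below converts continuous features into bits using quantile cut points and a thermometer code. The source only states that features are binned into binary representations, so the choice of thermometer encoding, the number of bins, and the function names are assumptions.

```python
from typing import List


def quantile_thresholds(values: List[float], n_bins: int) -> List[float]:
    """Pick n_bins - 1 cut points at evenly spaced ranks of the sorted training values."""
    ordered = sorted(values)
    return [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]


def thermometer_encode(value: float, thresholds: List[float]) -> List[int]:
    """One bit per threshold: bit k is 1 when value >= thresholds[k]."""
    return [1 if value >= t else 0 for t in thresholds]


def binarize_row(row: List[float], per_feature_thresholds: List[List[float]]) -> List[int]:
    """Concatenate the thermometer codes of all features into one binary vector."""
    bits: List[int] = []
    for value, thresholds in zip(row, per_feature_thresholds):
        bits.extend(thermometer_encode(value, thresholds))
    return bits


# Hypothetical training values for a single feature (e.g. a packet rate).
train_rate = [10, 20, 30, 40, 50, 60, 70, 80]
cuts = quantile_thresholds(train_rate, n_bins=4)

print(cuts)                                   # [30, 50, 70]
print(thermometer_encode(55, cuts))           # [1, 1, 0]
print(binarize_row([55, 5], [cuts, cuts]))    # [1, 1, 0, 0, 0, 0]
```

Cut points should be fitted on the training split only and reused unchanged at inference, so that a given bit always means the same threshold comparison.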
Experimental Design and Dataset Properties
Evaluations are conducted on CICIoMT24, a comprehensive public dataset engineered for holistic IoMT security benchmarking, encompassing Bluetooth, MQTT, and Wi-Fi network protocols and a spectrum of high-prevalence attack types (DoS, DDoS, Reconnaissance, MQTT-specific, Spoofing). Multi-protocol, multi-class settings mirror realistic hospital and telemedicine deployments.
Class imbalance, pervasive in security logs, is addressed with SMOTE in the TM pipeline, while the baseline models use class-weighted strategies.
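The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a simplified standard-library sketch of that idea, not the implementation used in the paper; the neighbour count `k`, the seed, and the toy points are assumptions.

```python
import math
import random
from typing import List

Point = List[float]


def k_nearest(idx: int, points: List[Point], k: int) -> List[int]:
    """Indices of the k nearest neighbours of points[idx] (Euclidean), excluding itself."""
    dists = [(math.dist(points[idx], p), j) for j, p in enumerate(points) if j != idx]
    return [j for _, j in sorted(dists)[:k]]


def smote(minority: List[Point], n_new: int, k: int = 5, seed: int = 0) -> List[Point]:
    """Generate n_new synthetic points on segments between minority neighbours.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic: List[Point] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))           # random minority sample
        j = rng.choice(k_nearest(i, minority, k))  # one of its nearest neighbours
        gap = rng.random()                         # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(minority[i], minority[j])])
    return synthetic


# Hypothetical minority-class samples with two continuous features.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points))
```

Consistent with the text, oversampling of this kind is applied to the training data only; the test split keeps its original class distribution so that reported metrics reflect realistic traffic.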
Figure 3: The pronounced class imbalance in Scenario 1 (binary classification) prior to balancing.
Figure 4: Post-SMOTE result—achieving balanced training class distributions for robust learning.
Scenario 1: Binary Classification (Bluetooth)
TM robustly distinguishes benign from DoS samples, achieving 99.5% accuracy, an F1-score of 0.995, and an inference latency of 0.743 µs, matching or surpassing the top ML baselines (DT, RF, XGBoost, LR). The confusion matrix shows a negligible false-positive rate (0.02%) and a false-negative rate below 1%, underscoring practical suitability for real-time IDS in latency-critical applications.
Figure 5: TM confusion matrix in the binary scenario (Benign vs. DoS), confirming high specificity and sensitivity.
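For reference, the rates quoted above follow directly from the four cells of a binary confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical values chosen only to be of the same order as the reported rates and are not the paper's actual matrix.

```python
def binary_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Standard metrics from a binary confusion matrix (positive class = attack)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tn + fp + fn + tp),
        "fpr": fp / (fp + tn),   # benign traffic flagged as attack
        "fnr": fn / (fn + tp),   # attacks that were missed
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, NOT taken from the paper.
m = binary_metrics(tn=9998, fp=2, fn=80, tp=9920)
print({name: round(value, 4) for name, value in m.items()})
```

With these illustrative counts the false-positive rate is 0.02% (2 of 10,000 benign samples) and the false-negative rate is 0.8% (80 of 10,000 attacks).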
Model interpretability is highlighted by class-wise voting and clause activation visualization: benign samples receive a strong positive vote margin, clearly supported by a predominance of activated positive clauses.
Figure 6: Visualized class-wise TM votes for a typical Benign sample—dominant positive evidence for correct prediction.
Figure 7: Clause activation map—Benign class displays highly active positive clauses endorsing the classification.
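The kind of per-sample explanation shown in these figures can be sketched as a listing of the clauses that fire for a given input, rendered as readable rules. The feature names and clauses below are hypothetical and hand-written; a trained TM would supply its own learned clauses.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, bool]  # (feature name, negated?)
Clause = List[Literal]


def fires(clause: Clause, sample: Dict[str, bool]) -> bool:
    """A literal holds when the feature bit differs from its 'negated' flag."""
    return all(sample[name] != negated for name, negated in clause)


def render(clause: Clause) -> str:
    """Turn a clause into a human-readable conjunction."""
    return " AND ".join(("NOT " if negated else "") + name for name, negated in clause)


def explain(pos: List[Clause], neg: List[Clause], sample: Dict[str, bool]) -> dict:
    """List the active clauses voting for and against a class."""
    return {
        "for": [render(c) for c in pos if fires(c, sample)],
        "against": [render(c) for c in neg if fires(c, sample)],
    }


# Hypothetical clauses for the Benign class.
benign_pos = [[("rate_high", True)], [("port_known", False), ("payload_small", True)]]
benign_neg = [[("rate_high", False), ("payload_small", False)]]

sample = {"rate_high": False, "payload_small": False, "port_known": True}
report = explain(benign_pos, benign_neg, sample)
print(report)
```

Here both positive clauses fire (`NOT rate_high` and `port_known AND NOT payload_small`) and no negative clause does, which corresponds to the strong positive vote margin described for benign samples.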
Scenario 2: Six-Class Multi-Class Classification (MQTT/Wi-Fi)
TM achieves 90.7% overall accuracy and an F1-score of 0.906, outperforming traditional ML methods (e.g., XGBoost, LGBM, DNN) by a margin of up to 17 percentage points in accuracy on challenging classes. Despite the increased class complexity, inference latency remains below 5 µs. TM maintains robust performance despite pronounced confusion between the DoS and DDoS classes, which is attributed to their similar volumetric characteristics.
Figure 8: Multi-class imbalance visualization (Scenario 2). An essential precursor to fair model evaluation.
Figure 9: SMOTE-restored balanced class composition in training.
The learning curves indicate stable convergence with no sign of overfitting.
Figure 10: Stable trajectory of TM training and testing accuracy versus epochs for multi-class setting.
The multi-class confusion matrix confirms dominant per-class diagonal elements, with contained misclassification.
Figure 11: TM confusion matrix, confirming high per-class accuracy in the six-class case.
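Reading a multi-class confusion matrix amounts to comparing each diagonal entry with its row total and locating the largest off-diagonal cell. A small sketch follows, using a hypothetical three-class matrix for brevity (the paper's scenario has six classes, and these counts are not its results).

```python
from typing import Dict, List, Tuple


def per_class_recall(cm: List[List[int]], labels: List[str]) -> Dict[str, float]:
    """Recall per class: diagonal count divided by the row (true-class) total."""
    return {labels[i]: cm[i][i] / sum(cm[i]) for i in range(len(labels))}


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def most_confused(cm: List[List[int]], labels: List[str]) -> Tuple[str, str, int]:
    """Largest off-diagonal cell as (true label, predicted label, count)."""
    n = len(labels)
    count, i, j = max((cm[i][j], i, j) for i in range(n) for j in range(n) if i != j)
    return labels[i], labels[j], count


# Hypothetical counts: rows are true classes, columns are predicted classes.
labels = ["Benign", "DoS", "DDoS"]
cm = [
    [95, 3, 2],
    [4, 80, 16],
    [1, 14, 85],
]

recall = per_class_recall(cm, labels)
worst_pair = most_confused(cm, labels)
print(recall, round(overall_accuracy(cm), 4), worst_pair)
```

In this toy matrix the largest off-diagonal cell is DoS predicted as DDoS, the same kind of confusion the text attributes to similar volumetric characteristics.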
Per-sample explainability remains clear, with vote and clause activation patterns tracing decisions back to concrete logic learned from malicious traffic signatures.
Figure 12: Class votes for a selected Recon sample—TM identifies the attack class unambiguously.
Figure 13: Clause activation heatmap—Recon sample produces activated logical clauses aligned with the attack pattern.
Scenario 3: Seven-Class Evaluation
With the task extended to seven classes, TM preserves its performance lead, registering 88.4% accuracy and F1-score in the setting with maximum label diversity. Competing ML models (DNN, RF, LGBM, LR, NB) report accuracy up to 12% lower and/or significantly reduced recall. Inference latency remains in the microsecond range.
Figure 14: Stack plot of class imbalance in the seven-class joint protocol scenario.
Class-wise voting for both benign and DDoS samples in this compounded scenario continues to yield clearly separated outputs, supporting operational trust.
Figure 15: Vote distribution for a benign traffic sample.
Figure 16: Vote distribution for a DDoS attack sample.
Comparative Evaluation with State-of-the-Art and Explainability
Across all scenarios, TM delivers near-parity or improved results versus the state of the art: in complex multi-class settings, TM exceeds DNN, LSTM, and the best published ML classifier benchmarks on CICIoMT24 by 9–18 percentage points in accuracy. This is achieved while ensuring per-decision transparency, traceability, and logic auditability, features insufficiently supported by prior deep learning or ensemble ML systems.
Broader Implications, Limitations, and Future Directions
The practical implications of this work are significant: TM-based IDSs can be efficiently executed on resource-limited medical nodes without reliance on cloud-scale resources, and each detection decision can be interpreted and audited through propositional logic, facilitating compliance and clinical acceptance.
Theoretically, this research demonstrates the necessity and feasibility of integrating logical clause learning into intrusion detection, addressing central challenges in class imbalance, adversarial robustness, and black-box opacity endemic to deep models.
For future development, key directions include: extension to real-world network traffic in physical testbeds, generation and open sourcing of broader attack corpora, further empirical evaluation on hardware-constrained devices, and hybridization with other symbolic or probabilistic reasoning frameworks for adversarial resilience.
Conclusion
This study establishes the efficacy and operational transparency of a TM-driven IDS for contemporary IoMT security, realizing notable advances in detection accuracy, interpretability, and deployment feasibility versus classical and deep ML alternatives. The TM approach uniquely satisfies the transparency and efficiency constraints required for critical medical applications and stands as a viable foundation for next-generation, explainable, and resource-efficient security paradigms in healthcare cyber-physical systems.
(2604.03205)