A-THENA: Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding and Network-Specific Augmentation

Published 23 Apr 2026 in cs.CR and cs.LG | (2604.21623v1)

Abstract: The proliferation of Internet of Things (IoT) devices has significantly expanded attack surfaces, making IoT ecosystems particularly susceptible to sophisticated cyber threats. To address this challenge, this work introduces A-THENA, a lightweight early intrusion detection system (EIDS) that significantly extends preliminary findings on time-aware encodings. A-THENA employs an advanced Transformer-based architecture augmented with a generalized Time-Aware Hybrid Encoding (THE), integrating packet timestamps to effectively capture temporal dynamics essential for accurate and early threat detection. The proposed system further employs a Network-Specific Augmentation (NA) pipeline, which enhances model robustness and generalization. We evaluate A-THENA on three benchmark IoT intrusion detection datasets (CICIoT23-WEB, MQTT-IoT-IDS2020, and IoTID20), where it consistently achieves strong performance. Averaged across all three datasets, it improves accuracy by 6.88 percentage points over the best-performing traditional positional encoding, 3.69 points over the strongest feature-based model, 6.17 points over the leading time-aware alternatives, and 5.11 points over related models, while achieving near-zero false alarms and false negatives. To assess real-world feasibility, we deploy A-THENA on the Raspberry Pi Zero 2 W, demonstrating its ability to perform real-time intrusion detection with minimal latency and memory usage. These results establish A-THENA as an agile, practical, and highly effective solution for securing IoT networks.


Authors (6)

Summary

The paper introduces continuous timestamp encoding to enhance early detection of IoT intrusions by capturing temporal dynamics in packet flows.
The paper reports up to 11 percentage points improvement over non-time-aware baselines and achieves high accuracy with extremely low false alarm rates.
The paper demonstrates real-time deployment on resource-constrained devices, ensuring practical applicability in diverse IoT environments.

Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding: A Comprehensive Analysis of A-THENA

Introduction

A-THENA ("A-THENA: Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding and Network-Specific Augmentation", 2604.21623) addresses critical limitations in network intrusion detection for Internet of Things (IoT) environments by modeling the temporal dynamics of packet sequences, detecting threats early with low latency, and adapting to practical deployment constraints. The system builds on continuous-time encodings for sequential data and integrates several technical modules into a unified, computationally efficient pipeline.

System Design and Methodological Innovations

A-THENA employs a lightweight Transformer-based encoder, ingesting raw byte sequences from network flows, together with their associated time-of-arrival information. Crucially, it deploys generalized Time-Aware Hybrid Encoding (THE) mechanisms, which directly utilize packet timestamps rather than discrete sequence indices. This replacement of positional integer indices with real-valued timestamps in sinusoidal, Fourier, and rotary positional encodings enables robust modeling of non-uniform temporal structures (Figure 1).

Figure 1: The system architecture: raw packet flows and timestamps are jointly encoded, with three time-aware modules (TA Sinusoidal, TA Fourier, TA RoPE) and a hybrid selection mechanism optimized via Early Detection Loss.
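
The index-to-timestamp substitution described above can be illustrated with a minimal sketch for the sinusoidal case. It is not the paper's implementation: the function names, the base of 10000, the embedding width, and the choice to measure time in seconds since the first packet are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def sinusoidal_encoding(positions: Sequence[float], d_model: int,
                        base: float = 10000.0) -> List[List[float]]:
    """Sinusoidal encoding evaluated at arbitrary real-valued positions."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in positions:
        row = []
        for i in range(d_model // 2):
            freq = base ** (-2.0 * i / d_model)
            row.append(math.sin(pos * freq))
            row.append(math.cos(pos * freq))
        table.append(row)
    return table


def time_aware_positions(timestamps: Sequence[float]) -> List[float]:
    """Seconds elapsed since the first packet (an assumed normalization)."""
    t0 = timestamps[0]
    return [t - t0 for t in timestamps]


# Hypothetical flow: a burst of three packets followed by a long pause.
timestamps = [100.000, 100.002, 100.004, 103.500]

# Classical variant: positions are the integer indices 0, 1, 2, 3.
index_pe = sinusoidal_encoding(range(len(timestamps)), d_model=8)

# Time-aware variant: positions are the real-valued arrival times.
time_pe = sinusoidal_encoding(time_aware_positions(timestamps), d_model=8)
```

In this sketch the index-based table is identical for every four-packet flow, whereas the time-aware table places the three burst packets close together and the delayed packet far away, which is the kind of timing irregularity the encoding is meant to expose.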

THE operates as a model selection layer: for each deployment or dataset, it picks the time-aware encoding that yields the lowest validation loss, that is, the one with the best early-detection behavior under the current threat distribution. A Network-Specific Augmentation pipeline adds realistic data diversity through offline steps (subflow generation, hybrid oversampling) and online, train-time perturbations (timestamp jitter, traffic scaling, and insertion, removal, or noising of packet bytes). The pipeline targets the performance regime relevant to resource-constrained, class-imbalanced, and earliness-critical settings (Figure 2).

Figure 2: Modular training and evaluation pipeline, including offline/online augmentation, THE variant selection, and integration of EDL optimization.
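
A few of the online perturbations named above (timestamp jitter, traffic scaling, byte noise) can be sketched as follows. The summary does not give the paper's exact procedures, so the function names, the Gaussian jitter model, and the parameters `sigma`, `factor`, and `p` are assumptions for illustration only.

```python
import random
from typing import List, Sequence


def jitter_timestamps(timestamps: Sequence[float], sigma: float,
                      rng: random.Random) -> List[float]:
    """Add Gaussian noise to each inter-arrival gap, keeping packet order."""
    out = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        noisy_gap = max(0.0, (cur - prev) + rng.gauss(0.0, sigma))
        out.append(out[-1] + noisy_gap)
    return out


def scale_traffic(timestamps: Sequence[float], factor: float) -> List[float]:
    """Stretch (factor > 1) or compress (factor < 1) all gaps uniformly."""
    t0 = timestamps[0]
    return [t0 + (t - t0) * factor for t in timestamps]


def noise_bytes(payload: bytes, p: float, rng: random.Random) -> bytes:
    """Replace each byte with a random value with probability p."""
    return bytes(rng.randrange(256) if rng.random() < p else b
                 for b in payload)


# Hypothetical training sample: arrival times in seconds and one raw packet.
rng = random.Random(42)
flow_times = [0.0, 0.01, 0.02, 1.5]
augmented_times = scale_traffic(jitter_timestamps(flow_times, 0.005, rng), 1.2)
augmented_packet = noise_bytes(b"\x45\x00\x00\x3c", 0.1, rng)
```

Clamping each noisy gap at zero keeps the timestamps non-decreasing, so the augmented flow is still a valid packet sequence.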

Early detection is further enforced through an Early Detection Loss (EDL) objective, a weighted cross-entropy that exponentially penalizes errors on shorter flows, thereby inducing a predictive bias toward correct classification under minimal packet exposure.
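
One way to read this objective is as a cross-entropy whose per-sample weight decays exponentially with the number of packets observed. The sketch below follows that reading. The weight form `exp(-alpha * (n - 1))`, the parameter `alpha`, and the normalization by the weight sum are assumptions, since the summary does not state the paper's exact formula.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def early_detection_loss(batch_logits: Sequence[Sequence[float]],
                         labels: Sequence[int],
                         packet_counts: Sequence[int],
                         alpha: float = 0.5) -> float:
    """Weighted cross-entropy; shorter flows get exponentially larger weight.

    The weight exp(-alpha * (n - 1)) is an assumed form, not the paper's.
    """
    total, weight_sum = 0.0, 0.0
    for logits, y, n in zip(batch_logits, labels, packet_counts):
        w = math.exp(-alpha * (n - 1))
        total += -w * log_softmax(logits)[y]
        weight_sum += w
    return total / weight_sum


# Two flows with identical logits favoring class 0: one seen after 1 packet,
# one after 10 packets. Only the placement of the wrong label differs.
logits = [[2.0, 0.0], [2.0, 0.0]]
loss_error_on_short = early_detection_loss(logits, [1, 0], [1, 10])
loss_error_on_long = early_detection_loss(logits, [0, 1], [1, 10])
```

Under these assumptions the same mistake costs far more when it happens on the one-packet flow than on the ten-packet flow, which is the bias toward early correctness that the text describes.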

Technical Advances in Temporal Encoding

The core contribution centers on substituting order indices with continuous timestamp vectors within all encoding mechanisms. Figure 3 compares the representational dynamics achieved by classical (index-based) and A-THENA's time-aware positional modules across attack/benign/stealthy flows, showing the semantic separation and expressive range unlocked by directly encoding timing irregularities.

Figure 3: Standard index-based vs. time-aware positional encodings (Sinusoidal, Fourier, RoPE) reveal starkly improved discrimination of SSH brute-force, benign, and backdoor malware traffic.
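
For the rotary case, a small sketch shows what changes when the rotation angle is driven by a timestamp instead of an index: the attention score between two packets then depends on their time gap. The function names, the base of 10000, and the example vectors are assumptions, and the paper's TA RoPE may differ in detail.

```python
import math
from typing import List, Sequence


def rotate(vec: Sequence[float], t: float, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs of dimensions by angles proportional to t."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = t * base ** (-2.0 * i / d)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def attention_score(q: Sequence[float], k: Sequence[float],
                    t_q: float, t_k: float) -> float:
    """Dot product of a query and key after time-driven rotation."""
    return sum(a * b for a, b in zip(rotate(q, t_q), rotate(k, t_k)))


# Hypothetical query and key vectors.
q = [1.0, 0.0, 0.5, -0.5]
k = [0.3, 0.7, -0.2, 0.1]

same_gap_early = attention_score(q, k, t_q=0.5, t_k=0.2)    # gap of 0.3 s
same_gap_late = attention_score(q, k, t_q=10.5, t_k=10.2)   # gap of 0.3 s
wider_gap = attention_score(q, k, t_q=3.5, t_k=0.2)         # gap of 3.3 s
```

The two scores with equal gaps agree up to floating-point error regardless of when the packets arrived, while the wider gap yields a different score. With integer indices, two adjacent packets would always look one step apart no matter how much time separated them.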

The theoretical justification rests on the hypothesis that in non-uniform time series, as is typical of network traffic, timing gaps, bursts, and delays carry discriminative information about attack signatures, and that conventional discrete-sequence encodings discard this information.

Experimental Analysis and Numerical Results

A-THENA is evaluated on three heterogeneous IoT intrusion benchmarks (CICIoT23-WEB, MQTT-IoT-IDS2020, IoTID20) spanning web, protocol, and botnet-scale attacks, with tasks configured for multiclass detection (6, 5, and 9 classes, respectively). Across all datasets, the hybrid time-aware encoding outperforms all non-time-aware baselines by 4.58–11.06 percentage points and all prior time-aware encodings (GTID, FATA, Time2Vec, CTLPE, ChronoFormer, PEA) by 3.69–6.17 points. The system achieves 100% accuracy, a 0% false alarm rate, and a 0% false negative rate on the two more regular datasets, and 93.83% accuracy on the most complex one, while making confident predictions as early as the first observed packet in most scenarios (see the tables in the paper).

Critically, the hybrid THE mechanism automatically adapts to select the empirically optimal encoding for each threat environment, a capability clearly documented in the cross-validation studies.
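
The selection step can be pictured as picking the candidate encoding with the lowest validation loss. The sketch below assumes each candidate is wrapped in a callable that trains a model and returns its validation loss. The names `select_encoding`, `ta_sinusoidal`, `ta_fourier`, and `ta_rope` are hypothetical, and the loss values are placeholders, not results from the paper.

```python
from typing import Callable, Dict, Tuple


def select_encoding(
    validators: Dict[str, Callable[[], float]]
) -> Tuple[str, Dict[str, float]]:
    """Evaluate every candidate and return the one with the lowest loss."""
    if not validators:
        raise ValueError("at least one candidate encoding is required")
    losses = {name: validate() for name, validate in validators.items()}
    best = min(losses, key=lambda name: losses[name])
    return best, losses


# Placeholder validation losses for illustration only (NOT from the paper).
# In practice each callable would train and validate a model variant.
candidates = {
    "ta_sinusoidal": lambda: 0.21,
    "ta_fourier": lambda: 0.18,
    "ta_rope": lambda: 0.15,
}
best_name, all_losses = select_encoding(candidates)
```

Keeping the selection outside the model means the deployed network carries only one encoding, which fits the lightweight footprint emphasized elsewhere in the analysis.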

Figure 4: Prediction confidence trajectories: A-THENA achieves rapid, high-confidence classification from minimal packet count, with stability not observed in index-based models.

Analysis of the accuracy-earliness tradeoff shows monotonic improvement as the confidence threshold is adjusted, with robust performance retained even under aggressive earliness constraints. Figure 5 further illustrates how stringent confidence thresholds affect earliness and full-flow coverage.

Figure 5: Parameter sweep of confidence threshold: Moderate tau yields optimal coverage and speed; extreme tau raises risk of delayed classification.
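
The role of the confidence threshold tau can be sketched as an early-exit rule: the classifier is queried after each packet and commits as soon as its top class probability reaches tau. The function name, the return convention, and the example probabilities are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def early_decision(prefix_probs: Sequence[List[float]],
                   tau: float) -> Tuple[int, int, bool]:
    """Return (predicted_class, packets_used, reached_threshold).

    prefix_probs[n - 1] holds the class probabilities after n packets.
    """
    if not prefix_probs:
        raise ValueError("need at least one prefix prediction")
    for n, probs in enumerate(prefix_probs, start=1):
        confidence = max(probs)
        if confidence >= tau:
            return probs.index(confidence), n, True
    # Threshold never reached: fall back to the full-flow prediction.
    last = prefix_probs[-1]
    return last.index(max(last)), len(prefix_probs), False


# Hypothetical probabilities after 1, 2, and 3 packets of a two-class flow.
probs_per_prefix = [[0.55, 0.45], [0.30, 0.70], [0.05, 0.95]]

loose = early_decision(probs_per_prefix, tau=0.5)
strict = early_decision(probs_per_prefix, tau=0.9)
extreme = early_decision(probs_per_prefix, tau=0.99)
```

In this toy flow a loose threshold commits after one packet to a class that later evidence contradicts, a stricter one waits for three packets, and an extreme one never fires and has to fall back to the full flow, mirroring the delayed-classification risk noted in the caption.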

Visualization of self-attention matrices (Figure 6) reveals that time-aware encoding induces marked attention realignment, leading to interpretable and attack-specific focus in the learned model, further justifying the technical approach.

Figure 6: Temporal attention heatmaps: TA RoPE produces structured, temporally localized attention, in contrast to diffuse non-time-aware models.

Resource-Constrained Inference and Practical Deployment

Deployment on a Raspberry Pi Zero 2 W demonstrates real-time feasibility (average inference latency of 1.4 ms, 40 KB on-disk model size, 3.25 MB memory footprint), substantially undercutting both feature-based methods (which incur roughly 664 ms of per-flow preprocessing delay) and parameter-heavy deep models. Aggressive INT8 quantization provides only a minor additional latency reduction, with diminishing returns because the model footprint is already minimal.
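
For readers unfamiliar with INT8 quantization, the sketch below shows a common symmetric per-tensor scheme that maps float weights to integers in [-127, 127] with a single scale factor. The summary does not say which quantization scheme or toolchain the authors used, so this is a generic illustration with hypothetical weights.

```python
from typing import List, Sequence, Tuple


def quantize_int8(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: Sequence[int], scale: float) -> List[float]:
    """Map the stored integers back to approximate float weights."""
    return [q * scale for q in quantized]


# Hypothetical weight tensor.
weights = [0.82, -0.41, 0.003, -1.27, 0.56]
q_weights, scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight shrinks from four bytes to one at the cost of a rounding error of at most half the scale. For a model that is already about 40 KB on disk, the absolute savings are small, which is consistent with the diminishing returns reported above.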

Component Analysis and Ablation

Ablation experiments (Figure 7) show that removing time-aware augmentation, removing EDL, or falling back to a plain cross-entropy loss degrades both accuracy and earliness in all settings, which indicates that the major A-THENA components are complementary.

Figure 7: Ablation study: Disablement of time-aware augmentation or EDL increases FAR/FNR and ERDE; THE loss and augmentation are non-redundant for optimal performance.

Implications and Future Research Directions

A-THENA offers a strong reference architecture for EIDS in constrained IoT contexts. Its principal implication is that, on the reported benchmarks, timestamp-driven representations outperform index-based ones for attack detection in temporally structured network environments, and that combining multiple temporal parameterizations with model selection yields cross-domain robustness. The system's feature-free modeling of raw traffic removes domain-specific feature-engineering bottlenecks and lets the model exploit packet-level detail directly.

The system remains dependent on accurate timestamp extraction and needs further hardening against adversarial timing perturbations. Extensions should assess robustness under timestamp noise and adversarial temporal transformations, as well as vertical and horizontal scaling to a broader range of IoT topologies and protocols.

Conclusion

A-THENA advances early intrusion detection in IoT by unifying time-aware Transformer encoding, targeted augmentation, and early-detection-centric optimization in a practical, light-footprint system. The framework's cross-dataset results, interpretability, and hardware efficiency make it a practical foundation for IoT threat monitoring. The research opens a path toward broader adoption of continuous-time sequence modeling in cybersecurity and other domains with temporally irregular sequential data.
