ML Defender (aRGus NDR): An Open-Source Embedded ML NIDS for Botnet and Anomalous Traffic Detection in Resource-Constrained Organizations

Published 3 Apr 2026 in cs.CR | (2604.04952v1)

Abstract: Ransomware and DDoS attacks disproportionately impact hospitals, schools, and small organizations that cannot afford enterprise security solutions. We present ML Defender (aRGus NDR), an open-source network intrusion detection system built in C++20, deployable on commodity hardware at approximately 150-200 USD. ML Defender implements a six-component pipeline over eBPF/XDP packet capture, ZeroMQ transport, and Protocol Buffers serialization, combining a rule-based Fast Detector with an embedded Random Forest classifier. The Maximum Threat Wins policy selects the arithmetic maximum of both scores, using ML inference to suppress false positives. Evaluated against the CTU-13 Neris botnet dataset: F1=0.9985, Precision=0.9969, Recall=1.0000, FPR=0.0002% (2 FP in 12,075 benign flows). The Fast Detector alone produces 6.61% FPR on benign traffic; the ML layer reduces this to zero -- a ~500-fold reduction. Per-class inference latency: 0.24-1.06 microseconds on commodity hardware. Under progressive load testing, the pipeline sustains ~34-38 Mbps with zero packet drops across 2.37 million packets. RAM stable at ~1.28 GB. The bottleneck is VirtualBox NIC emulation, not pipeline logic. All figures are conservative lower bounds; bare-metal characterization is future work. This work was developed through the Consejo de Sabios, a structured multi-LLM peer review methodology. Test-Driven Hardening (TDH) is proposed as a methodology for security-critical distributed systems. ML Defender is released under the MIT license.

Authors (1)

Alonso Isidoro Román

Summary

- The paper introduces ML Defender, an embedded ML-based NIDS/NDR whose dual-path detection combines a fast heuristic detector with a Random Forest classifier.
- It details a decoupled pipeline built on zero-copy Linux XDP capture, ZeroMQ, and Protocol Buffers, secured with ChaCha20-Poly1305 encryption and designed to be reproducible on commodity hardware.
- Experimental evaluation on the CTU-13 benchmark reports an F1 score of 0.9985 and near-zero false positives, highlighting its potential for resource-constrained deployments.

ML Defender (aRGus NDR): An Embedded Open-Source ML-Based NIDS for Resource-Constrained Environments

Problem Setting and Research Motivation

ML Defender (aRGus NDR) addresses a persistently underserved domain: practical, deployable network intrusion detection and response for organizations lacking access to enterprise-grade security infrastructure (e.g., hospitals, schools, small businesses). Existing open-source NIDS solutions such as Snort and Suricata are grounded in signature- or rule-based detection and exhibit limited resilience to novel attacks and encrypted payloads. Machine learning-based approaches have gained traction in the literature, yet a persistent gap separates experimental prototypes from operational systems: they typically depend on specialized hardware, complex operations, or cloud inference, all of which are unsuitable for high-risk, budget-limited environments.

The core objective is to build and evaluate a fully embedded, open-source ML NIDS/NDR that runs on commodity hardware (approximately 150–200 USD), provides real-time packet classification and automated network response, and keeps a light operational and resource footprint.

System Architecture and Technical Contributions

The ML Defender pipeline comprises six fully decoupled stages connected via ZeroMQ, exchanging Protocol Buffers messages secured with ChaCha20-Poly1305 authenticated encryption. Packet capture uses the Linux kernel’s XDP functionality for zero-copy operation; a sharded flow manager then groups packets into network flows, and features are computed over a 10-second sliding window.
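The flow-tracking step described above can be sketched as follows. This is a minimal illustration, not the project's C++20 implementation: the class names, the shard count, the CRC32-based shard selection, and the two features computed here are all assumptions made for the example. Only the 10-second window length comes from the text.

```python
import zlib
from collections import defaultdict, deque

WINDOW_SECONDS = 10.0  # window length stated in the summary
NUM_SHARDS = 4         # hypothetical shard count


class FlowShard:
    """Holds per-flow packet records for the flows hashed to this shard."""

    def __init__(self):
        self.flows = defaultdict(deque)  # flow key -> deque of (timestamp, size)

    def add(self, key, ts, size):
        window = self.flows[key]
        window.append((ts, size))
        # Evict records that have fallen out of the sliding window.
        while window and window[0][0] <= ts - WINDOW_SECONDS:
            window.popleft()

    def features(self, key):
        window = self.flows.get(key, ())
        return {
            "packets": len(window),
            "bytes": sum(size for _, size in window),
        }


class ShardedFlowManager:
    """Routes each flow to a fixed shard so shards can be processed independently."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [FlowShard() for _ in range(num_shards)]

    def _shard(self, key):
        # CRC32 of the key's text form is stable across runs,
        # unlike Python's built-in hash() for strings.
        return self.shards[zlib.crc32(repr(key).encode()) % len(self.shards)]

    def observe(self, key, ts, size):
        self._shard(key).add(key, ts, size)

    def features(self, key):
        return self._shard(key).features(key)


mgr = ShardedFlowManager()
flow = ("10.0.0.5", 49152, "10.0.0.9", 445, "tcp")  # illustrative 5-tuple
for ts, size in [(0.0, 60), (4.0, 1500), (9.0, 40), (12.0, 200)]:
    mgr.observe(flow, ts, size)
stats = mgr.features(flow)
```

After the packet at t = 12.0 arrives, the record from t = 0.0 is outside the window, so `stats` covers only the three most recent packets.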

The detection architecture is dual-path:

- Fast Detector: A deterministic, rule- and threshold-based engine employing heuristics on metrics such as IP velocity, SMB connection count, port diversity, and RST ratio.
- ML Detector: An embedded Random Forest, compiled directly to C++20 data structures, that ingests 28-feature vectors and produces continuous threat scores for ransomware, DDoS, regular traffic, and internal threats. The internal-threat model runs via ONNX Runtime so that the difference in latency and footprint can be characterized.
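To make the Fast Detector's rule-and-threshold style concrete, here is a small sketch over the four metrics named above. The threshold values, the field names, and the scoring rule (fraction of rules that fire) are hypothetical choices for illustration; the summary does not state the actual rules or their thresholds.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for this example.
IP_VELOCITY_MAX = 5.0      # new destination IPs per second
SMB_CONN_MAX = 10          # SMB connections per window
PORT_DIVERSITY_MAX = 20    # distinct destination ports per window
RST_RATIO_MAX = 0.3        # share of packets carrying RST


@dataclass(frozen=True)
class WindowStats:
    unique_dst_ips: int
    smb_connections: int
    unique_dst_ports: int
    rst_packets: int
    total_packets: int
    window_seconds: float = 10.0


def fast_detector_score(s: WindowStats) -> float:
    """Return the fraction of heuristic rules that fire, in [0, 1]."""
    rst_ratio = s.rst_packets / s.total_packets if s.total_packets else 0.0
    fired = [
        s.unique_dst_ips / s.window_seconds > IP_VELOCITY_MAX,
        s.smb_connections > SMB_CONN_MAX,
        s.unique_dst_ports > PORT_DIVERSITY_MAX,
        rst_ratio > RST_RATIO_MAX,
    ]
    return sum(fired) / len(fired)


benign = WindowStats(unique_dst_ips=3, smb_connections=1,
                     unique_dst_ports=4, rst_packets=2, total_packets=500)
scanner = WindowStats(unique_dst_ips=200, smb_connections=150,
                      unique_dst_ports=60, rst_packets=400, total_packets=800)

benign_score = fast_detector_score(benign)
scanner_score = fast_detector_score(scanner)
```

Because the rules are pure functions of the window statistics, the same input always yields the same score, which is the property the text calls deterministic.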

The “Maximum Threat Wins” policy (arithmetic max of both detectors’ confidences) is used rather than consensus or OR/AND-based decisions, optimizing for false positive suppression without sacrificing detection recall.
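The policy itself reduces to a single `max` over the two confidences. The sketch below assumes both detectors emit scores on a common [0, 1] scale; the `Verdict` type and the `source` field are hypothetical additions that record which detector supplied the winning score.

```python
from typing import NamedTuple


class Verdict(NamedTuple):
    score: float   # combined threat confidence
    source: str    # which detector produced it ("fast" or "ml")


def maximum_threat_wins(fast_score: float, ml_score: float) -> Verdict:
    """Combine two confidences by taking their arithmetic maximum."""
    if ml_score >= fast_score:
        return Verdict(ml_score, "ml")
    return Verdict(fast_score, "fast")


v1 = maximum_threat_wins(fast_score=0.2, ml_score=0.9)
v2 = maximum_threat_wins(fast_score=0.7, ml_score=0.1)
```

Keeping the result continuous, rather than collapsing it to a boolean vote, leaves the choice of blocking threshold to a later stage. How the two detectors calibrate their scores relative to each other is not specified in this summary.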

Additional architectural components include a semantic Observability subsystem implementing a local RAG pipeline (FAISS + TinyLlama), and an extensible plugin architecture with explicit Trusted Computing Base minimization, ABI stability, and strong post-invocation invariants for core security.

Security primitives employ cryptographically sound channel separation (distinct HKDF-SHA256 derivations per channel and direction), HMAC-SHA256 log validation, and forward compatibility hooks for dynamic group key negotiation (Noise Protocol Framework).
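As an illustration of HMAC-SHA256 log validation, the sketch below tags each log line and verifies the tag on read. The `|hmac=` line format, the function names, and the example key are assumptions for this example; the summary does not describe the project's actual log format or key handling.

```python
import hashlib
import hmac


def sign_log_line(key: bytes, line: str) -> str:
    """Append a hex HMAC-SHA256 tag computed over the line's UTF-8 bytes."""
    tag = hmac.new(key, line.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{line}|hmac={tag}"


def verify_log_line(key: bytes, signed: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    line, sep, tag = signed.rpartition("|hmac=")
    if not sep:
        return False  # no tag present
    expected = hmac.new(key, line.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), tag.encode())


log_key = b"example-log-key-not-for-production"
signed = sign_log_line(log_key, "2026-04-03T10:00:00Z BLOCK 10.0.0.5 score=0.97")
tampered = signed.replace("BLOCK", "ALLOW")

ok = verify_log_line(log_key, signed)
rejected = not verify_log_line(log_key, tampered)
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of a forged tag were correct.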

The pipeline deterministically produces identical decisions for identical packet streams, supporting rigorous reproducibility. All design artifacts, tests, and methodology (including all ADRs and test-driven hardening steps) are publicly available.

Experimental Evaluation and Numerical Results

Evaluation is based on the CTU-13 “Neris” botnet scenario, a widely used benchmark, with all models trained strictly on synthetic data to preclude overfitting to artifacts in the validation set. The following results are reported:

Detection Metrics (CTU-13 Neris):
- F1: 0.9985
- Precision: 0.9969
- Recall: 1.0000
- False Positive Rate: 0.0002% (2 FP in 12,075 benign flows, both attributable to non-production VirtualBox artifacts)
- Fast Detector FPR on benign traffic: 6.61%
- ML Detector reduces real production blocks to zero; approx. 500× FP reduction over heuristics alone
- Per-class inference latency: from 0.24 μs (DDoS) to 1.06 μs (Ransomware)
- All measurements on commodity hardware under a virtualized environment (upper-bound for latency, lower-bound for throughput; bare-metal evaluation is future work)
Throughput and Resource Utilization:
- Sustained packet processing up to ~34–38 Mbps (VirtualBox NIC limit) with zero packet loss or deserialization error across 2.37 million packets
- ml-detector stabilizes at ~3.2 logical cores
- RAM usage for the full pipeline stable at ~1.28 GB
- Clear evidence for queue stability and correct draining in post-replay conditions
Ablation Study:
- Config A (Fast Detector alone): 6.61% FPR on bigFlows
- Config B (ML Detector): 0 blocks on bigFlows, 12 detections on Neris
- Config C (Dual/Deployed): F1=0.9985, FPR=0.0002%, confirms effectiveness of dual-score policy
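The headline metrics can be cross-checked from the figures listed above. This sketch uses only the reported precision, recall, false-positive count, and benign-flow count; the true-positive count is not given in this summary, so F1 is recomputed from the rounded precision and recall rather than from raw counts.

```python
precision = 0.9969
recall = 1.0000
false_positives = 2
benign_flows = 12_075

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# False positive rate = FP / (all benign flows).
fpr = false_positives / benign_flows

print(f"F1  = {f1:.4f}")
print(f"FPR = {fpr:.6f} as a fraction, {fpr * 100:.4f} in percent")
```

The recomputed F1 agrees with the reported 0.9985 to within the rounding of the inputs. The ratio 2/12,075 is about 1.66e-4, which rounds to 0.0002 when read as a fraction and corresponds to roughly 0.017 when expressed in percent.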

An explicit caveat is that generalization to modern (post-2020) ransomware, DDoS, and encrypted C2 remains unvalidated, though the implemented system architecture readily admits further training and evaluation on new data sources.

Methodological Innovations

ML Defender was developed through a collaborative, structured peer-review process in which LLMs (the “Consejo de Sabios”) acted as adversarial co-reviewers across the architectural, implementation, and testing phases. This process yielded refined methodological practices, most notably Test-Driven Hardening (TDH), which emphasizes writing failing integration tests to expose latent protocol or semantic flaws that static analysis and unit-level validation do not detect.

A documented case study involving an HKDF context mismatch demonstrates the practical value of this methodology for protocol correctness, especially in cryptographic or distributed settings, where stateful or cross-layer invariants are otherwise challenging to validate.
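The kind of integration test that catches such a mismatch can be sketched as below. The details of the actual bug are not given in this summary, so everything here is illustrative: the context strings, the salt, and the `roundtrip_ok` helper are hypothetical, and HMAC-SHA256 stands in for the ChaCha20-Poly1305 AEAD so the example needs only the standard library. The `hkdf_sha256` function follows the extract-then-expand construction of RFC 5869.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with SHA-256 (RFC 5869): extract a PRK, then expand it with `info`."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def channel_key(master: bytes, context: str) -> bytes:
    # The context string binds the key to one channel and one direction.
    return hkdf_sha256(master, salt=b"example-salt", info=context.encode())


def roundtrip_ok(sender_context: str, receiver_context: str) -> bool:
    """End-to-end check: does a tag made by the sender verify at the receiver?"""
    master = bytes(range(32))  # fixed test secret, not a real key
    message = b"flow-record"
    tag = hmac.new(channel_key(master, sender_context), message, hashlib.sha256).digest()
    expected = hmac.new(channel_key(master, receiver_context), message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


matching = roundtrip_ok("sniffer->detector", "sniffer->detector")
mismatched = roundtrip_ok("sniffer->detector", "detector->sniffer")
```

Each side's key derivation can pass its own unit tests while the two sides still disagree on the context string. Only a test that exercises sender and receiver together, like `roundtrip_ok`, fails in that case.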

Plugin extensibility, clear TCB boundaries, and explicit forward compatibility for cryptographic upgrades also represent substantive practical contributions over prior open-source NIDS/EDR projects.

Implications, Limitations, and Future Directions

Practically, ML Defender demonstrates that high-accuracy, real-time ML-based intrusion detection and active response are achievable on commodity hardware, making such protection accessible to resource-constrained institutions. The strong numerical results on CTU-13 Neris establish feasibility within the tested regime but stop short of universal detection claims. The architecture’s use of interpretable classical ML (Random Forests) facilitates engineering transparency, low-latency inference, and minimal dependencies.

The system’s current generalization is limited by the use of synthetic training data and by its restriction to the legacy attack patterns present in CTU-13, a limitation the author states explicitly in the paper’s limitations section. Throughput evaluation is bottlenecked by the virtualized environment, not by inherent pipeline limitations; bare-metal validation and scaling to multi-gigabit environments remain medium-term priorities.

More broadly, the work argues that commodity deployments, local inference, and deterministic architectures can be made robust and practical at operational scale. Further research is needed into distributed collaborative learning, federated deployment, and strong cryptographic isolation (potentially involving hardware-backed roots of trust).

Future work includes:

- Expansion to contemporary threat corpora and post-2020 attack variants
- Full real-feature set implementation and feature importance assessment
- Distributed deployment, high-availability coordination, and peer-to-peer key negotiation
- TL-bounded response, honeypot, and behavioral quarantine strategies
- Completion of the federated telemetry and signed plugin learning pipeline for ongoing model improvement without cloud-based vendor lock-in

Conclusion

ML Defender (aRGus NDR) demonstrates that modern ML techniques, properly engineered, can close the operational gap for network intrusion detection and response in under-resourced environments. Its end-to-end determinism, reproducible pipeline, explicit cryptographic discipline, and robust empirical methodology mark a significant step toward democratizing enterprise-grade security. While present validation is limited to the CTU-13 Neris scenario, the detailed account of architectural choices, testing strategy, and numerical performance provides a strong foundation for future extension, further empirical validation, and real-world deployment in at-risk organizations.
