ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems

Published 6 Apr 2026 in cs.AI | (2604.04426v1)

Abstract: Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on third-party tools and MCP servers, a new class of supply-chain threats has emerged, where malicious behaviors are embedded in seemingly benign tools, silently hijacking agent execution, leaking sensitive data, or triggering unauthorized actions. Despite their growing impact, there is currently no comprehensive benchmark for evaluating such threats. To bridge this gap, we introduce SC-Inject-Bench, a large-scale benchmark comprising over 10,000 malicious MCP tools grounded in a taxonomy of 25+ attack types derived from MITRE ATT&CK targeting supply-chain threats. We observe that existing MCP scanners and semantic guardrails perform poorly on this benchmark. Motivated by this finding, we propose ShieldNet, a network-level guardrail framework that detects supply-chain poisoning by observing real network interactions rather than surface-level tool traces. ShieldNet integrates a man-in-the-middle (MITM) proxy and an event extractor to identify critical network behaviors, which are then processed by a lightweight classifier for attack detection. Extensive experiments show that ShieldNet achieves strong detection performance (up to 0.995 F-1 with only 0.8% false positives) while introducing little runtime overhead, substantially outperforming existing MCP scanners and LLM-based guardrails.


Authors (8)

Summary

The paper presents a network-level detection framework (ShieldNet) that leverages runtime network traces to identify stealthy supply-chain attacks.
The methodology features a novel SC-Inject-Bench benchmark with nearly 20,000 execution traces and a lightweight ML model achieving up to 0.995 F-1 score.
The approach supports real-time detection with low runtime overhead and outperforms MCP scanners and intrusion detection systems (IDS) on the benchmark.

Introduction

The proliferation of agentic systems, particularly those that use LLMs with the Model Context Protocol (MCP) to integrate third-party tools dynamically, has exposed new vectors for supply-chain attacks. Current security approaches focus predominantly on semantic-level signals, such as input/output analysis and prompt-injection detection, and are fundamentally limited against attacks in which malicious payloads are injected directly into benign-seeming tool implementations. "ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems" (2604.04426) addresses this gap by systematically characterizing these threats, developing a representative benchmark (SC-Inject-Bench), and introducing ShieldNet, a network-level detection framework that operates entirely on network traces captured at runtime.

Figure 1: Comparison between semantic-level and network-level views of tool execution. Network-level analysis surfaces stealthy malicious behaviors invisible to semantic-layer defenses.

Threat Characterization and Benchmark Construction

A principal contribution of the work is SC-Inject-Bench, a large-scale, systematically validated benchmark for MCP supply-chain threats. Prior benchmarks such as MCPSecBench, MCPTox, and MCP Security Bench focus on prompt, schema, or semantic manipulation. SC-Inject-Bench instead injects attacks at the code level into real-world MCP tool implementations across diverse repositories and frameworks, following a taxonomy of 29 network-visible techniques derived from MITRE ATT&CK.

The data curation pipeline addresses three practical requirements: it (1) localizes injection points in heterogeneous codebases, (2) triggers attacks through agent-controlled MCP invocation, and (3) validates executions through network-level observation, discarding runs in which the malicious instance produces no detectable traffic signature.
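The validation step (3) can be pictured as a filter over recorded runs. The sketch below is illustrative only: the `Trace` record, its fields, and the host-matching rule are assumptions made for this example, not the benchmark's actual validation logic, which this summary does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trace:
    """Hypothetical record of one tool execution (all fields are assumptions)."""
    tool: str
    malicious: bool
    # Destinations actually observed on the wire during the run.
    observed_hosts: List[str] = field(default_factory=list)
    # Destination the injected payload was expected to contact, if any.
    expected_host: Optional[str] = None


def keep_trace(t: Trace) -> bool:
    """Keep benign runs; keep malicious runs only if the attack showed up in traffic."""
    if not t.malicious:
        return True
    return t.expected_host is not None and t.expected_host in t.observed_hosts


def validate(traces: List[Trace]) -> List[Trace]:
    """Drop malicious runs whose payload left no observable network signature."""
    return [t for t in traces if keep_trace(t)]


traces = [
    Trace("weather", malicious=False),
    Trace("notes", malicious=True,
          observed_hosts=["evil.example"], expected_host="evil.example"),
    Trace("search", malicious=True,
          observed_hosts=["api.example"], expected_host="evil.example"),
]
print([t.tool for t in validate(traces)])
```

In this toy run the `search` trace is discarded because its injected payload never reached the expected destination, so labeling it malicious would give a network-level detector nothing to learn from.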

Figure 3: End-to-end data curation pipeline of SC-Inject-Bench, integrating code-level injection, agent execution, and network-based verification.

The resulting benchmark comprises 109 servers, 984 tools, and nearly 20,000 benign and malicious execution traces covering a wide range of real-world tools (Figures 6–13). Its scale and validation methodology are reported to exceed those of existing MCP security datasets, which makes it well suited to evaluating network-aware guardrails.

ShieldNet Architecture

ShieldNet captures full packet traces and decrypts application-layer traffic through a controlled MITM proxy, producing structured, time-ordered event sequences that feed a lightweight post-trained LLM for detection and categorization. The approach is strictly non-semantic: the pipeline is agnostic to tool metadata, call graphs, and tool I/O, and relies exclusively on features observable at the network layer.

Figure 2: Overview of ShieldNet's network-based guardrail, highlighting end-to-end pipeline from packet capture to risk classification.

The key innovations in this architecture include:

Robust traffic interception: Traffic is routed through a local MITM proxy and recorded as a full PCAP capture, with measures against protocols that could bypass the proxy (e.g., QUIC).
Deterministic structured representation: Event extraction and sequence serialization preserve event order, filter irrelevant noise, and compress the detection-relevant signals, which enables effective training even on compact models (e.g., Qwen3-0.6B).
Streaming/real-time support: ShieldNet supports online detection over sliding windows, raising alerts while a tool execution is still in progress.
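To make the idea of a deterministic structured representation concrete, here is a minimal sketch of turning decrypted flow records into a stable, time-ordered text sequence. The record fields, the `NOISE_HOSTS` filter, the `max_events` cap, and the line format are all assumptions chosen for illustration; the paper's actual event schema is not given in this summary.

```python
# Hypothetical record format: one dict per decrypted request seen by the proxy.
NOISE_HOSTS = {"telemetry.example.com"}  # assumed noise filter, illustration only


def serialize_events(records, max_events=64):
    """Turn flow records into a stable, time-ordered text sequence."""
    kept = [r for r in records if r["host"] not in NOISE_HOSTS]
    # Sort on a full key so records with equal timestamps still get one fixed order.
    kept.sort(key=lambda r: (r["ts"], r["host"], r["path"], r["method"]))
    lines = [
        f"{i}|{r['method']}|{r['host']}{r['path']}|out={r['bytes_out']}"
        for i, r in enumerate(kept[:max_events])
    ]
    return "\n".join(lines)


records = [
    {"ts": 2.0, "method": "POST", "host": "files.example.org",
     "path": "/upload", "bytes_out": 4096},
    {"ts": 1.0, "method": "GET", "host": "api.example.com",
     "path": "/v1/weather", "bytes_out": 120},
    {"ts": 1.5, "method": "GET", "host": "telemetry.example.com",
     "path": "/ping", "bytes_out": 10},
]
print(serialize_events(records))
# 0|GET|api.example.com/v1/weather|out=120
# 1|POST|files.example.org/upload|out=4096
```

Because the output depends only on the record contents and not on arrival order, the same execution always yields the same sequence, which is the property a small classifier needs in order to train on compact inputs.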

Experimental Analysis

Detection Performance

ShieldNet achieves strong results on SC-Inject-Bench, reaching up to 0.995 F-1 with a false positive rate (FPR) of only 0.8% at the traffic level. It substantially outperforms MCP scanners (e.g., Cisco AI, Invariant Labs), host and network IDS (e.g., Suricata, SAFE-NID), and LLM-based detectors prompted on network data. The baselines either fail to generalize (static, semantic-only scanners) or exhibit a significant tradeoff between recall and FPR (IDS).
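For readers less familiar with the two reported metrics, the following sketch computes F-1 and FPR from confusion-matrix counts using their standard definitions. The counts in the example are made up for illustration and are not taken from the paper.

```python
def f1_and_fpr(tp, fp, fn, tn):
    """Return (F-1, false positive rate) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # FPR: share of benign samples that were wrongly flagged as malicious.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return f1, fpr


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
f1, fpr = f1_and_fpr(tp=90, fp=10, fn=10, tn=90)
print(f"F-1={f1:.3f} FPR={fpr:.3f}")  # F-1=0.900 FPR=0.100
```

The two numbers answer different questions: F-1 summarizes how well attacks are caught, while FPR measures how often benign tool traffic is flagged, which is why an IDS can score well on one and poorly on the other.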

Figure 4: Class-wise F-1 scores. ShieldNet's structured approach maintains high detection across diverse attack classes, where prior methods generalize weakly.

ShieldNet's model also generalizes to previously unseen MCP servers and attack techniques, a critical requirement in open agentic ecosystems, achieving <3% FPR and F-1 ≈ 0.998 under out-of-distribution evaluation.

Efficiency and Real-Time Viability

ShieldNet adds a relative runtime overhead of about 21%, substantially less than LLM-based traffic baselines. This efficiency matters in deployments with frequent tool calls, where detection must keep pace with execution.
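As a reading aid, the relative overhead can be written out as below. This assumes the figure compares execution time with and without the guardrail; the symbols and the 1.00 s and 1.21 s values are hypothetical and chosen only to reproduce a 21% increase.

```latex
\text{overhead}
  = \frac{T_{\text{guarded}} - T_{\text{baseline}}}{T_{\text{baseline}}},
\qquad
\frac{1.21\,\text{s} - 1.00\,\text{s}}{1.00\,\text{s}} = 0.21 = 21\%
```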

Ablation Analysis

Ablation studies demonstrate that both application-layer decryption and event-level structuring are essential: removing decryption increases FPR from 2.2% to 89.6%, and omitting both modules further degrades class-level discrimination due to context truncation and semantic misalignment.

Live Deployment and Streaming Detection

ShieldNet is validated in interactive deployments, where it intercepts and processes network events generated by front-line LLM tools (e.g., Claude Code) in real time, successfully raising alerts on stealthy runtime attacks.

Figure 5: Real-world streaming detection with live MCP server execution and sliding-window event analysis. Automated malicious execution identification during tool runtime is visualized in real time.
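The sliding-window analysis can be sketched as a generator that scores the most recent events as they arrive. This is a minimal illustration, not ShieldNet's implementation: `stream_alerts`, the `window` and `stride` parameters, and the keyword-matching `classify` stand-in are all assumptions, with the stand-in taking the place of the trained classifier.

```python
from collections import deque


def stream_alerts(events, classify, window=4, stride=1):
    """Yield (event_index, window_contents) whenever a full window is flagged."""
    buf = deque(maxlen=window)  # oldest event drops out automatically
    for i, ev in enumerate(events, start=1):
        buf.append(ev)
        # Score only full windows, and only every `stride` events.
        if len(buf) == window and (i - window) % stride == 0:
            snapshot = list(buf)
            if classify(snapshot):
                yield i, snapshot


def classify(window_events):
    """Stand-in for the trained model: flag any window touching a known-bad host."""
    return any("evil.example" in ev for ev in window_events)


events = [
    "GET api.example.com/v1/weather",
    "GET api.example.com/v1/forecast",
    "POST evil.example/upload",
    "GET api.example.com/v1/alerts",
    "GET api.example.com/v1/radar",
]
alerts = list(stream_alerts(events, classify, window=2, stride=1))
print([i for i, _ in alerts])  # [3, 4]
```

An alert fires as soon as the suspicious request enters the window (event 3) and again while it remains inside it (event 4), so detection does not have to wait for the tool call to finish. A larger `stride` trades alert latency for fewer classifier calls.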

Implications and Future Directions

The findings support a central claim: semantic-only approaches are insufficient on their own to defend against supply-chain attacks on agentic platforms. Because code-level injection bypasses interface-based and I/O-bound methods, reliable defense against this attack surface requires runtime behavioral guardrails grounded in observable execution artifacts.

Practically, network-level guardrails such as ShieldNet can be deployed at modest computational cost, generalize to new techniques and servers, and require no access to proprietary code or metadata. Conceptually, this shifts the design principle for LLM agent security from static policy enforcement to continuous verification of observed behavior, rooted in host and network telemetry.

Current limitations include incomplete coverage of purely local attacks that generate no network traffic (e.g., privilege escalation, file deletion) and an inherently reactive, rather than proactive, detection paradigm. Integrating host telemetry and multi-modal signals (e.g., file system activity, process introspection, semantic anomaly scoring) could yield defense-in-depth architectures spanning the full agentic supply chain.

Conclusion

"ShieldNet" (2604.04426) marks a significant advancement in supply-chain security for LLM agentic systems by (1) exposing the limitations of existing semantic and static MCP scanners, (2) introducing a rigorously validated, code-injection-focused benchmark, and (3) demonstrating that lightweight, network-level ML models can achieve near-perfect discrimination of stealthy runtime attacks while supporting real-time deployment. The shift from semantic-layer inspection to execution-aware behavioral guardrails is warranted by both scaling threats and empirical outcomes, and will likely inform future design in secure agentic ecosystem architectures.
