PentestMCP: Agentic PenTest Toolkit

Updated 11 October 2025

PentestMCP is a modular, agentic penetration testing toolkit that exposes core offensive tasks as remote-procedure MCP servers.
It integrates established tools like Metasploit and supports dynamic, multi-agent workflows for reconnaissance, exploitation, and post-exploitation.
Designed for automation-as-code, it enables customizable, reusable attack scripts to orchestrate complex and adaptive testing campaigns.

PentestMCP is a modular toolkit for agentic penetration testing that exposes the core stages of the offensive security lifecycle as remote-procedure Model Context Protocol (MCP) servers. The framework lets security practitioners and agents run automated, coordinated, and scriptable penetration test campaigns by integrating established tools such as Metasploit through standardized APIs. This agent-oriented architecture supports dynamic, multi-agent workflows, allowing flexible orchestration of reconnaissance, exploitation, and post-exploitation activities across different environments.

1. Architectural Foundation and Design Principles

PentestMCP takes a server-oriented, agentic approach that decouples “control intelligence” (e.g., plan selection, attack reasoning, orchestration) from the underlying execution primitives (e.g., network scanning, exploit launching). The architecture is realized as a library of MCP servers, each exposing a domain-specific set of penetration testing capabilities. Agents, whether driven by natural language reasoning, automated policy, or custom workflow logic, connect to these servers through MCP’s remote procedure call paradigm, which allows tasks to be delegated across servers and their responses collated.

The fast-agent framework is highlighted as one agent-management layer that interacts with PentestMCP’s APIs, but the architecture remains extensible with custom or model-driven agent frameworks.
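The separation between control logic and execution primitives described above can be illustrated with a minimal, in-process sketch. It assumes a JSON-RPC 2.0 style request with a `tools/call` method, which is how MCP tool invocations are commonly framed; the `ToolServer` class, the `call_tool` helper, and the canned `network_scan` result are hypothetical stand-ins, not PentestMCP's actual code, and the response envelope is simplified.

```python
import json
from itertools import count
from typing import Any, Callable, Dict


def network_scan(target: str) -> Dict[str, Any]:
    """Hypothetical execution primitive; returns canned data, no network I/O."""
    return {"target": target, "open_ports": [135, 139, 445]}


class ToolServer:
    """Toy stand-in for one MCP server that exposes named tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def handle(self, raw_request: str) -> str:
        """Decode a request, run the named tool, and encode the response."""
        req = json.loads(raw_request)
        params = req["params"]
        fn = self._tools.get(params["name"])
        if fn is None:
            body = {"error": {"code": -32601, "message": "unknown tool"}}
        else:
            body = {"result": fn(**params["arguments"])}
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], **body})


_request_ids = count(1)


def call_tool(server: ToolServer, name: str, **arguments: Any) -> Any:
    """Agent-side helper: build the request, send it, unwrap the result."""
    request = json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })
    response = json.loads(server.handle(request))
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    return response["result"]


server = ToolServer()
server.register("network_scan", network_scan)
scan = call_tool(server, "network_scan", target="10.0.0.5")
```

The agent side only knows tool names and arguments, so the scanning primitive could be swapped or moved to another process without changing the planning code.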

2. Penetration Testing Functional Coverage

PentestMCP encapsulates the typical penetration test stages within discrete MCP servers and corresponding API endpoints:

Network Scanning & Enumeration: Agents can invoke MCP endpoints that perform comprehensive scanning (e.g., Nmap via a network_scan RPC), enumerate reachable hosts, enumerate open ports, and collect basic system metadata.
Service Fingerprinting: Additional MCP endpoints provide access to fingerprinting routines, returning details about service banners, protocol compliance, or OS fingerprinting results.
Vulnerability Scanning: PentestMCP wraps vulnerability scanners and exposes results (CVE enumerations, scan findings) through a structured MCP data model. These scans can be programmatically chained with recon activities.
Exploitation: The toolkit integrates the Metasploit RPC API through a dedicated MCP server. Agents can search for exploits, configure exploit and payload modules based on earlier discoveries (e.g., targeting CVE-2017-0144, as demonstrated against the TryHackMe “Blue” room), and launch exploits with tailored options.
Post-Exploitation: Typical persistence and information-gathering modules are included, allowing agents to automate the collection of privileged information or maintain access.
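The source does not spell out the structured data model that carries results from one stage to the next, so the following is only a plausible shape, sketched with standard-library dataclasses. The class names (`Service`, `Finding`, `HostReport`) and their fields are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Service:
    """One fingerprinted service on a host (hypothetical schema)."""
    port: int
    name: str
    banner: str = ""


@dataclass
class Finding:
    """One vulnerability-scan finding tied to a port (hypothetical schema)."""
    cve_id: str
    port: int
    description: str = ""


@dataclass
class HostReport:
    """Accumulates per-host results as stages complete."""
    address: str
    services: List[Service] = field(default_factory=list)
    findings: List[Finding] = field(default_factory=list)

    def ports_to_scan(self) -> List[int]:
        """Ports that a follow-up vulnerability scan should focus on."""
        return sorted({s.port for s in self.services})

    def to_json(self) -> str:
        """Serialize the report so it can cross an RPC boundary."""
        return json.dumps(asdict(self), sort_keys=True)


report = HostReport(address="10.0.0.5")
report.services.append(Service(port=445, name="microsoft-ds"))
report.services.append(Service(port=139, name="netbios-ssn"))
report.findings.append(Finding(cve_id="CVE-2017-0144", port=445))
```

Chaining then amounts to feeding `report.ports_to_scan()` into the next stage and appending whatever that stage returns.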

The generic workflow these stages support can be stated semi-formally:

$\text{For target } T,\, \text{enumerate vulnerabilities } V;\, \forall v\in V,\, \text{if exploit } E(v)\, \text{is available, launch } E(v) \text{ with options } P(v).$
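The rule above maps directly onto a short control loop. The sketch below takes the four operations as injected callables, so it runs against harmless stubs; `run_campaign` and the stub names are hypothetical, and in practice each callable would wrap a call to the relevant MCP server.

```python
from typing import Callable, Dict, List, Optional, Tuple

Options = Dict[str, str]


def run_campaign(
    target: str,
    enumerate_vulns: Callable[[str], List[str]],
    find_exploit: Callable[[str], Optional[str]],
    build_options: Callable[[str, str], Options],
    launch: Callable[[str, Options], bool],
) -> Dict[str, bool]:
    """For target T: enumerate V; for each v with an exploit E(v), launch it with P(v)."""
    outcomes: Dict[str, bool] = {}
    for vuln in enumerate_vulns(target):        # V
        exploit = find_exploit(vuln)            # E(v), or None if unavailable
        if exploit is None:
            continue                            # no exploit, nothing to launch
        options = build_options(target, vuln)   # P(v)
        outcomes[vuln] = launch(exploit, options)
    return outcomes


# Stubs that record calls instead of touching any real system.
launched: List[Tuple[str, Options]] = []


def fake_enumerate(target: str) -> List[str]:
    return ["VULN-A", "VULN-B"]


def fake_find(vuln: str) -> Optional[str]:
    return {"VULN-A": "module-a"}.get(vuln)


def fake_options(target: str, vuln: str) -> Options:
    return {"RHOSTS": target}


def fake_launch(exploit: str, options: Options) -> bool:
    launched.append((exploit, options))
    return True


outcomes = run_campaign("10.0.0.5", fake_enumerate, fake_find, fake_options, fake_launch)
```

Only `VULN-A` has a matching exploit in the stub data, so it is the only vulnerability for which `launch` is called.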

3. Integration and Automation of Industry Tools

Rather than implementing bespoke security engines, PentestMCP standardizes access to established penetration testing utilities. The Metasploit RPC API is exposed as an MCP server, covering payload configuration, exploit management, session control, and post-exploitation utilities as described in the official Metasploit documentation. Agents built with fast-agent are used to orchestrate logic-driven, step-by-step execution. This abstraction lets security teams automatically chain reconnaissance, scanning, exploit search, and exploitation tasks with fine-grained control.

The system is demonstrated against known vulnerable platforms such as TryHackMe’s “Blue” room. In these scenarios, an agent performs initial network enumeration, fingerprints open services, matches discovered services to public vulnerabilities, and then automatically selects, configures, and launches Metasploit modules via MCP-RPC, closing the automation loop from discovery to successful exploitation.

4. Customizability and Workflow Composition

By modularizing penetration testing functionality as pluggable MCP servers, PentestMCP enables researchers and practitioners to compose complex, scenario-specific agent workflows. For example, a multi-agent workflow may consist of:

Task decomposition: agent splits the campaign into reconnaissance, scanning, exploitation.
Assignment: each subtask is sent to a dedicated MCP server (e.g., nmap-server, metasploit_rpc).
Coordination: a planning module collects status, escalates follow-up tasks (such as privilege escalation if a meterpreter session is opened), or adapts to intermediate findings.

This enables quantifiable, repeatable, and human-readable penetration testing scripts that integrate existing and new tools by simply registering API endpoints as MCP servers. The approach aligns with the trend towards “security automation as code,” where campaign logic is fully externalized and reconfigurable.
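The decomposition, assignment, and coordination steps listed above can be sketched as a small task queue. The `Task` and `Result` types, the stage names, and the `coordinate` function are assumptions for illustration; the handlers are stubs standing in for calls to dedicated MCP servers.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Task:
    stage: str   # e.g. "recon", "scan", "exploit", "privesc"
    target: str


@dataclass
class Result:
    task: Task
    session_opened: bool = False


Handler = Callable[[Task], Result]


def decompose(target: str) -> List[Task]:
    """Task decomposition: split the campaign into ordered stages."""
    return [Task(stage, target) for stage in ("recon", "scan", "exploit")]


def coordinate(target: str, servers: Dict[str, Handler]) -> List[str]:
    """Assign each task to its server and queue follow-ups based on results."""
    queue: Deque[Task] = deque(decompose(target))
    completed: List[str] = []
    while queue:
        task = queue.popleft()
        result = servers[task.stage](task)   # assignment to a dedicated server
        completed.append(task.stage)
        # Coordination: escalate only when exploitation opened a session.
        if task.stage == "exploit" and result.session_opened:
            queue.append(Task("privesc", task.target))
    return completed


servers: Dict[str, Handler] = {
    "recon": lambda t: Result(t),
    "scan": lambda t: Result(t),
    "exploit": lambda t: Result(t, session_opened=True),
    "privesc": lambda t: Result(t),
}
log = coordinate("10.0.0.5", servers)
```

Because follow-up tasks are queued from results rather than hard-coded, the same loop adapts when a stage reports something different, for example skipping privilege escalation when no session is opened.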

5. Applications, Demonstration, and Evaluation Context

Although PentestMCP is introduced as a toolkit announcement, it is evaluated in controlled offensive scenarios typical of security research and education (e.g., the TryHackMe “Blue” room, targeting CVE-2017-0144). These demonstrations indicate that full-lifecycle agentic penetration tests are feasible, from initial discovery through exploitation, with all key steps orchestrated by an agent through the MCP abstraction.

The toolkit is situated in an ecosystem of emergent agentic and LLM-driven security platforms (for example, PentestGPT, Microsoft Security Copilot), reflecting the field’s progression towards integrating generative AI and programmable multi-agent systems with established offensive security tools and workflows.

6. Relevance and Position in the Security Landscape

PentestMCP fits into the ongoing transition from traditional, manual, sequential penetration testing towards a paradigm where autonomous, multi-agent workflows coordinate, automate, and optimize every phase of the attack surface discovery and exploitation process. By exposing all core tasks as MCP APIs, the toolkit creates a substrate for experimentation with AI-driven penetration testing strategies, agent composition, and adaptive attack planning, leveraging both the scalability of the MCP ecosystem and the mature capabilities of existing security tools.

In conclusion, PentestMCP provides a library of MCP-compatible servers wrapping common penetration testing helpers, fosters the development of reusable attack workflows, and enables security professionals and researchers to implement logic-driven, automated, multi-step campaigns using agentic programming principles. Its demonstration against standard vulnerable platforms and its modular, API-driven design position it as a significant enabler for future research and practice in agentic penetration testing (Ezetta et al., 4 Oct 2025).


References (1)

Ezetta et al. (2025). PentestMCP: A Toolkit for Agentic Penetration Testing.
