MCP Servers in Scientific Workflows

Updated 5 November 2025

MCP Servers are thin, typed service adapters that standardize access to heterogeneous scientific cyberinfrastructure, making tools and resources discoverable and composable.
They use containerization and streamable HTTP for secure, efficient orchestration across diverse APIs and computing sites, which eases workflow integration.
Their architecture supports dynamic error handling and federated authentication (e.g., OAuth), enabling scalable, cross-domain execution of complex scientific tasks.

Model Context Protocol (MCP) servers are thin, typed service adapters designed to make scientific capabilities discoverable, invokable, and composable by LLM-powered agents within research and high-performance computing (HPC) environments. MCP formalizes access to heterogeneous scientific cyberinfrastructure (CI) through a unified protocol layer, enabling modular orchestration of workflows that cross traditional boundaries of site, software, and operational semantics.

1. Motivation and Theoretical Model

The primary challenge addressed by MCP servers in science and HPC is orchestration across highly heterogeneous CI, where APIs, authentication methods, and operational abstractions differ substantially between services. MCP provides a uniform interface whereby agents (typically LLM-driven) can discover, invoke, and compose distributed computational and data resources. This approach is formalized as follows:

User prompt: $p$
LLM: $L$
User credentials: $\Phi_{\text{user}}$
MCP server set: $\mathcal{M}$
Computing site set: $\mathcal{S}$

Each MCP server is defined as $M_k = \langle \mathcal{C}_k, \Phi_k \rangle$, extended to $\langle \mathcal{C}_k, \Phi_k, \mathcal{P}_k, \mathcal{X}_k \rangle$, where $\mathcal{C}_k$ are capabilities, $\mathcal{P}_k$ are exposed prompts, and $\mathcal{X}_k$ are resources. A capability $C_j$ is a tuple $\langle \mathcal{I}_j, \mathcal{E}_j, \mathcal{D}_j, \mathcal{R}_j \rangle$ specifying interface, execution logic, description, and site requirements.

A complete workflow is modeled as: $\boxed{ \mathcal{W}(p) = \textsf{Execute}\left(\textsf{Resolve}\left(\textsf{Plan}(p, L), \mathcal{M}, \mathcal{S}\right), \Phi_{\text{user}}\right) }$ where Plan, Resolve, and Execute represent agent-driven planning, tuple resolution (site-tool-capability), and invocation.
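The formal model above can be read as a three-stage pipeline. The following is a minimal Python sketch of that reading, not the paper's implementation: the class and function names, the use of scope sets as a stand-in for $\Phi_k$ and $\Phi_{\text{user}}$, and the "first site that satisfies the requirements" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass
class Capability:
    """C_j = <I_j, E_j, D_j, R_j> (hypothetical field names)."""
    interface: Dict[str, type]                 # I_j: argument names and types
    execute: Callable[..., Any]                # E_j: execution logic
    description: str                           # D_j: natural-language description
    requirements: Set[str] = field(default_factory=set)  # R_j: site requirements


@dataclass
class MCPServer:
    """M_k = <C_k, Phi_k>; prompts and resources are omitted for brevity."""
    capabilities: Dict[str, Capability]
    auth_scopes: Set[str]                      # assumed stand-in for Phi_k


def plan(prompt: str, llm: Callable[[str], List[str]]) -> List[str]:
    """Plan(p, L): the LLM maps a prompt to an ordered list of capability names."""
    return llm(prompt)


def resolve(steps: List[str], servers: Dict[str, MCPServer],
            sites: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Resolve: bind each step to a (server, capability, site) tuple."""
    bound = []
    for step in steps:
        for server_name, server in servers.items():
            cap = server.capabilities.get(step)
            if cap is None:
                continue
            # Pick the first site whose features cover the capability's requirements.
            site = next((s for s, feats in sites.items()
                         if cap.requirements <= feats), None)
            if site is not None:
                bound.append((server_name, step, site))
                break
        else:
            raise LookupError(f"no server/site can satisfy step {step!r}")
    return bound


def execute(bound: List[Tuple[str, str, str]], servers: Dict[str, MCPServer],
            user_scopes: Set[str]) -> List[Any]:
    """Execute: invoke each bound capability if the user's scopes allow it."""
    results = []
    for server_name, step, site in bound:
        server = servers[server_name]
        if not server.auth_scopes <= user_scopes:
            raise PermissionError(f"user lacks scopes for {server_name}")
        results.append(server.capabilities[step].execute(site=site))
    return results


def run_workflow(prompt, llm, servers, sites, user_scopes):
    """W(p) = Execute(Resolve(Plan(p, L), M, S), Phi_user)."""
    return execute(resolve(plan(prompt, llm), servers, sites), servers, user_scopes)
```

In this sketch the composition in `run_workflow` mirrors the boxed expression term by term; a real agent would interleave planning and resolution rather than run them strictly in sequence.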

2. Implementation Strategies and Architectural Characteristics

MCP servers are implemented as thin containers (typically Docker) that wrap mature research services such as:

Globus Transfer: Secure, performant data migration between storage systems.
Globus Compute: Remote execution of Python and shell-callable functions.
Globus Search: Metadata indexing/querying in distributed repositories.
Facility Status APIs: Real-time status of ALCF/NERSC and related HPC sites.
Octopus Event Fabric: Event streaming/consumption for system and filesystem events.
Garden: Machine learning potential catalog for scientific inference/discovery.
Rhea: Large-scale interface exposing thousands of Galaxy bioinformatics tools.

Deployment and Security Details:

Each server is deployed as a self-contained container with isolated dependencies.
Communication is via streamable-HTTP to support bi-directional, efficient, low-latency interaction.
Authentication is handled locally at the server level (OAuth 2.0 preferred), with the container managing token refresh and scope restriction. This architecture enables robust operation behind site firewalls, with minimal impact on upstream CI security posture.
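As an illustration of what "managing token refresh and scope restriction" inside the container might look like, here is a small sketch. It does not use any real OAuth library: `TokenManager`, `Token`, the `fetch` callback, and the `skew` parameter are hypothetical names, and the actual token request to the identity provider is assumed to happen inside `fetch`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass
class Token:
    value: str
    scopes: FrozenSet[str]
    expires_at: float  # seconds, on the same clock as TokenManager's clock


class TokenManager:
    """Caches one token per scope set and refreshes it shortly before expiry."""

    def __init__(self, fetch: Callable[[FrozenSet[str]], Token],
                 allowed_scopes: Iterable[str],
                 clock: Callable[[], float] = time.time,
                 skew: float = 30.0):
        self._fetch = fetch                    # performs the real OAuth exchange (assumed)
        self._allowed = frozenset(allowed_scopes)
        self._clock = clock
        self._skew = skew                      # refresh this many seconds early
        self._cache: Dict[FrozenSet[str], Token] = {}

    def get(self, scopes: Iterable[str]) -> str:
        requested = frozenset(scopes)
        # Scope restriction: never request more than the server was configured for.
        if not requested <= self._allowed:
            raise PermissionError(f"scopes not permitted: {sorted(requested - self._allowed)}")
        token = self._cache.get(requested)
        if token is None or token.expires_at - self._skew <= self._clock():
            token = self._fetch(requested)     # first use or near expiry: refresh
            self._cache[requested] = token
        return token.value
```

Injecting the clock and the fetch function keeps the sketch testable without network access; a deployment would wire `fetch` to whatever OAuth 2.0 client the site uses.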

3. Discoverability, Invocation, and Composition at Scale

The primary functional virtue of MCP servers is the explicit separation of discoverability and invocation. Each server makes available:

Discovery APIs: Allowing agents to enumerate available tools, prompt templates, and resources, typically returned as structured/typed descriptions or via embedding-based retrieval.
Invocation APIs: Once discovered and authenticated, tools are programmatically callable with typed argument validation.
Workflow composition: Agents dynamically chain steps by resolving the (tool, site, capability) tuple after each execution, adjusting for site-specific constraints and runtime availability.
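The separation of discovery from invocation can be sketched with a small in-process registry. This is only an illustration of the idea, not the MCP wire protocol or any MCP SDK: `ToolRegistry`, `list_tools`, `call`, and the example `transfer_file` tool are hypothetical, and the type check handles plain classes such as `str` or `bool` only.

```python
import inspect
from typing import Any, Callable, Dict, List


class ToolRegistry:
    """Holds tools and exposes separate discovery and invocation entry points."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        self._tools[fn.__name__] = fn
        return fn

    def list_tools(self) -> List[Dict[str, Any]]:
        """Discovery: return typed descriptions without running anything."""
        described = []
        for name, fn in self._tools.items():
            sig = inspect.signature(fn)
            described.append({
                "name": name,
                "description": inspect.getdoc(fn) or "",
                "params": {p.name: p.annotation.__name__
                           for p in sig.parameters.values()},
            })
        return described

    def call(self, name: str, **kwargs: Any) -> Any:
        """Invocation: validate arguments against annotations, then run the tool."""
        fn = self._tools[name]
        sig = inspect.signature(fn)
        bound = sig.bind(**kwargs)  # raises TypeError on missing or unexpected names
        for arg, value in bound.arguments.items():
            expected = sig.parameters[arg].annotation
            if expected is not inspect.Parameter.empty and not isinstance(value, expected):
                raise TypeError(f"{arg} must be {expected.__name__}, "
                                f"got {type(value).__name__}")
        return fn(**bound.arguments)


registry = ToolRegistry()


@registry.register
def transfer_file(source: str, destination: str, verify: bool) -> str:
    """Copy a file between two endpoints (stub for illustration)."""
    return f"{source} -> {destination}" + (" (verified)" if verify else "")
```

An agent would first read `registry.list_tools()` to decide what to call, then issue `registry.call("transfer_file", source=..., destination=..., verify=True)`; malformed arguments fail before any work is dispatched.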

At scale, particularly in ecosystems such as Rhea (Galaxy), naively exposing every tool exceeds the agent's context window. Retrieval-augmented generation (RAG) over tool embeddings, implemented as dynamic server-side vector search over tool descriptors, lets LLM agents discover tools semantically: only the top-k matches enter the context, so retrieval quality can be measured with Recall@k while tool context bloat is reduced.
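To make the retrieval step and the Recall@k metric concrete, here is a toy sketch. It substitutes a bag-of-words vector for a learned embedding model, which is an assumption made purely so the example runs without external dependencies; the tool names and descriptions are likewise made up for illustration and are not taken from Rhea.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real server would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, tools: Dict[str, str], k: int) -> List[str]:
    """Return the k tool names whose descriptors are most similar to the query."""
    q = embed(query)
    ranked = sorted(tools, key=lambda name: cosine(q, embed(tools[name])), reverse=True)
    return ranked[:k]


def recall_at_k(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of the relevant tools that appear in the retrieved list."""
    return len(set(retrieved) & relevant) / len(relevant) if relevant else 0.0


# Hypothetical tool descriptors.
TOOLS = {
    "fasttree": "build phylogenetic tree from alignment",
    "bwa": "align sequencing reads to reference genome",
    "samtools_sort": "sort alignment bam file",
}
```

With this catalog, `top_k("phylogenetic tree", TOOLS, 1)` surfaces only `fasttree`, so the agent's context holds one descriptor instead of the whole catalog.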

4. Scientific Workflow Case Studies

Computational Chemistry (Garden-AI)

Objective: Molecular batch relaxation using machine-learned interatomic potentials.
MCP pattern: Garden MCP server exposes models, agent discovers, invokes remote inference (potentially on cloud/HPC), retrieves results through standardized APIs, and adapts job submission/polling patterns for batch scaling.

Multi-site Bioinformatics (Phylogenetics)

Objective: Phylogenetic tree estimation across sites (FastTree, RAxML, IQ-TREE).
MCP pattern: Status MCP identifies an available compute site, Globus Transfer MCP moves input/output data, Globus Compute MCP automates function/analysis execution, and the agent generates all required glue code and carries out the multi-site orchestration.
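The glue logic an agent produces for this pattern might resemble the sketch below. The status record fields (`up`, `queued_jobs`), the "shortest queue" selection rule, and the `transfer` and `compute` callables are all hypothetical placeholders for the Status, Globus Transfer, and Globus Compute MCP calls; none of them reflect the real service APIs.

```python
from typing import Callable, Dict, Tuple


def pick_site(status: Dict[str, dict]) -> str:
    """Choose the available site with the fewest queued jobs (assumed policy)."""
    available = {name: s for name, s in status.items() if s["up"]}
    if not available:
        raise RuntimeError("no site is currently available")
    return min(available, key=lambda name: available[name]["queued_jobs"])


def run_phylogeny(status: Dict[str, dict],
                  transfer: Callable[..., str],
                  compute: Callable[..., str],
                  input_path: str,
                  tool: str = "FastTree") -> Tuple[str, str]:
    """Status -> Transfer -> Compute, returning the chosen site and result path."""
    site = pick_site(status)                                   # Status MCP
    remote_input = transfer(src=input_path, dst_site=site)     # Globus Transfer MCP
    remote_tree = compute(site=site, tool=tool, input_path=remote_input)  # Globus Compute MCP
    return site, remote_tree
```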

Quantum Chemistry

Objective: Computation of HOMO-LUMO gaps via ab initio codes.
MCP pattern: Agent coordinates multi-stage workflows (input upload, structure optimization, batch PySCF execution, status monitoring, downstream visualization) via Globus Compute MCP.

Filesystem Monitoring (Icicle + Octopus)

Objective: Aggregate and stream real-time filesystem metrics/events for administrative oversight.
MCP pattern: Octopus MCP streams file events; Globus Search MCP enables retrospective/historical queries; agent visualizes data, combines live and historical analytics, and manages event subscriptions.

Use Case	MCP Servers	Sites	Key Software
Phylogenetics	Globus Compute, Transfer	ALCF, NERSC	FastTree, RAxML, IQ-TREE
Molecular Design	Garden, Globus Compute	ALCF, Cloud	MACE, ASE
Quantum Chemistry	Globus Compute	ALCF	PySCF, GPU4PySCF
Filesystem Monitoring	Octopus, Globus Search	Local	N/A

5. Technical Challenges and Adaptivity

Authentication and Authorization

Robust authentication remains a pain point, especially for cross-domain deployments. Local handling of OAuth flows within containers is tractable, but cross-domain, multi-user scenarios require more advanced federation (e.g., FastMCP Remote OAuth). Dynamic authentication to multiple MCP endpoints is preferable to site-wide static credentials.

Scale and Discoverability

Naively exposing a large tool catalog exceeds agent context windows. Embedding-based retrieval, as used in Rhea, significantly improves retrieval accuracy in large tool deployments and mitigates this context constraint.

Resilience and Self-correction

LLM agents exhibit some degree of adaptive error handling, recovering from invocation or path errors by retrying with corrected arguments or function calls. However, session-level consistency remains variable, and agents do not systematically enforce workflow invariants or output formats, leading to potentially heterogeneous pipeline results.
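The retry-with-correction behaviour described here can be sketched as a bounded loop. This is an illustrative wrapper rather than how any particular agent framework works: `invoke_with_correction` and the `correct` callback (standing in for the LLM proposing revised arguments after reading the error) are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple


def invoke_with_correction(tool: Callable[..., Any],
                           args: Dict[str, Any],
                           correct: Callable[[Dict[str, Any], Exception], Dict[str, Any]],
                           max_attempts: int = 3) -> Tuple[Any, List[Tuple[dict, str]]]:
    """Call `tool`; on failure, ask `correct` for new arguments and retry.

    Returns the result plus a history of failed attempts, so that a
    later verification step can audit what was changed.
    """
    history: List[Tuple[dict, str]] = []
    for _ in range(max_attempts):
        try:
            return tool(**args), history
        except Exception as exc:  # broad on purpose: the agent sees any tool error
            history.append((dict(args), repr(exc)))
            args = correct(args, exc)
    raise RuntimeError(f"tool failed after {max_attempts} attempts: {history}")
```

Bounding the attempts and recording the history addresses only part of the problem raised above; enforcing workflow invariants or output formats would need an explicit validation step after the call.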

6. Open Issues and Research Directions

Cross-domain Hosting: Uniform, robust, and user-friendly authentication and trust-boundary establishment for MCP servers spanning divergent security and operational domains remain unsolved.
Evaluation and Trust: Quantitative benchmarks for workflow reliability, error rate measurement, and verification of agent-driven scientific results are not established.
Resilience in Long-running Tasks: Agents managing extended, multi-step scientific workflows require improved error detection, rollback, and recovery mechanisms.
Impact of LLM/Agent Variance: Systematic assessment of how different agent/LLM “personalities” or frameworks affect MCP-driven workflow quality is lacking; agent robustness and adaptability vary.

7. Summary and Implications for the Domain

MCP servers, when implemented as pragmatic, lightweight adapters over mature scientific CI services, sharply lower the overhead for LLM agents to orchestrate and execute complex, distributed scientific workflows. The architecture fosters dynamic tool discovery, compositional workflow construction, and adaptive error handling. These advances extend agentic science beyond rigid script-based automation. However, unresolved challenges in authentication, resilience, cross-domain operation, and systematic trust verification limit the approach's current robustness. Progress toward standardized evaluation, security boundary hardening, and agent workflow verification is required to support wide adoption in high-stakes computational science.

The paper’s experience highlights the technical maturity of the MCP model for compositional science and high-performance computing, while also identifying critical open problems whose resolution will define its impact on next-generation research cyberinfrastructure.
