
AI/ML Model Cards in Edge AI Cyberinfrastructure: towards Agentic AI (2511.21661v1)

Published 26 Nov 2025 in cs.DC and cs.DB

Abstract: AI/ML model cards can contain a benchmarked evaluation of an AI/ML model against its intended use, but a one-time assessment during model training does not capture how and where a model is actually used over its lifetime. Through Patra Model Cards embedded in the ICICLE AI Institute software ecosystem, we study model cards as dynamic objects. The study reported here assesses the benefits and tradeoffs of adopting the Model Context Protocol (MCP) as an interface to the Patra Model Card server. Quantitative assessment shows the overhead of MCP as compared to a REST interface. The core question, however, is that of the active sessions enabled by MCP; this is a qualitative question of fit and use in the context of dynamic model cards, which we address as well.

Summary

  • The paper introduces the Patra Model Card framework, shifting static audit artifacts to dynamic, lifecycle-driven entities for continuous model traceability.
  • It empirically compares REST and MCP protocols, showing REST’s lower latency for small payloads while highlighting MCP’s session-oriented benefits for large, interactive deployments.
  • The study emphasizes integrating runtime telemetry and provenance in edge environments to support responsible, agentic AI operations with enhanced auditability.

Dynamic AI/ML Model Cards and Agentic Edge AI: Systemization and Protocol Integration

Introduction and Motivation

This paper (2511.21661) systematically addresses the realization of dynamic, lifecycle-driven AI/ML model cards in edge-centric cyberinfrastructure, emphasizing the operational utility of such cards beyond initial benchmark reporting. Whereas model cards were originally conceived as static, human-readable audit artifacts, this work introduces the Patra Model Card framework—a system embedding continuous model usage telemetry, deployment history, and context-aware metrics for persistent traceability, operational accountability, and agentic orchestration.

The central research thrust interrogates model card implementation and utilization when deployed as interactive, dynamic objects, particularly when serving agentic AI scenarios requiring contextual, long-lived session semantics. The practical tension between RESTful statelessness and the session-based Model Context Protocol (MCP) is empirically and conceptually analyzed.

Model Card Lifecycle and System Architecture

The authors establish a formal model lifecycle for AI/ML models in edge environments, decomposing the process into program objects, serialized objects, deployment artifacts (model images), and temporally contextualized inference instances. This object-state granularity underpins the provenance and dynamism required for adaptive edge workflows (Figure 1).

Figure 1: AI/ML model object lifecycle, showing the transitions among program, serialized, image, and inference instance states.
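The four object states and their forward transitions can be sketched as a small state machine. This is a minimal illustration of the lifecycle in Figure 1, not code from the Patra system; the enum and transition names are assumptions chosen to mirror the paper's terminology.

```python
from enum import Enum, auto

class ModelState(Enum):
    """The four lifecycle states from Figure 1 (names illustrative)."""
    PROGRAM = auto()      # source/training code object
    SERIALIZED = auto()   # persisted weights/artifact
    IMAGE = auto()        # deployable model image
    INFERENCE = auto()    # temporally contextualized inference instance

# Allowed forward transitions along the lifecycle.
TRANSITIONS = {
    ModelState.PROGRAM: {ModelState.SERIALIZED},
    ModelState.SERIALIZED: {ModelState.IMAGE},
    ModelState.IMAGE: {ModelState.INFERENCE},
    ModelState.INFERENCE: set(),
}

def can_transition(src: ModelState, dst: ModelState) -> bool:
    """Check whether a lifecycle transition is permitted."""
    return dst in TRANSITIONS[src]
```

Tracking which state a model object is in, per deployment, is what lets the card record provenance at each transition rather than only at release time.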

Patra Model Cards are systematically integrated within ICICLE's AI cyberinfrastructure, collecting both static (design-time metadata, fairness, XAI audit results) and runtime artifacts (usage statistics, deployment spatiotemporal traces) through distributed logging, event streaming, and automated graph population (Figure 2).

Figure 2: Development-deployment pipeline with explicit model card collection points, delineating onboarding, deployment, and operational introspection phases.

The graph-based knowledge representation (Neo4j/Cypher) is central to Patra, enabling constant-time traversals for multi-hop provenance queries and facilitating automated, criteria-driven model selection by external orchestrators (Figure 3).

Figure 3: Core entities of the Patra Model Card as an E-R diagram, illustrating the relationships among ModelCard, Model, Deployment, and EdgeServer nodes.
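The flavor of a multi-hop provenance query over the entities in Figure 3 can be shown with a plain in-memory graph. This is a sketch only: the real system runs Cypher against Neo4j, and the node identifiers below are invented for illustration.

```python
from collections import deque

# Illustrative in-memory mirror of the Patra graph entities
# (ModelCard, Model, Deployment, EdgeServer); identifiers are made up.
EDGES = {
    "ModelCard:mc-1": ["Model:resnet50"],
    "Model:resnet50": ["Deployment:d-001", "Deployment:d-002"],
    "Deployment:d-001": ["EdgeServer:field-cam-7"],
    "Deployment:d-002": ["EdgeServer:greenhouse-3"],
}

def provenance_hops(start: str, max_hops: int) -> set:
    """Collect nodes reachable within max_hops edges of `start`,
    i.e., a bounded multi-hop provenance traversal (BFS)."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen - {start}
```

An orchestrator asking "on which edge servers has this card's model ever run?" is a three-hop traversal of this kind; in Cypher it would be a variable-length path pattern rather than explicit BFS.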

Edge Deployment and Operational Use Cases

The utility of dynamic model cards is highlighted via detailed deployment pipelines supporting environmental or field science applications. Models are provisioned and orchestrated over the ICICLE continuum (cloud-edge-field), with inference results, telemetry, and resource metrics automatically reported and integrated through plugins leveraging messaging streams (Kafka, ZMQ) (Figure 4).

Figure 4: Realized pipeline in field research, demonstrating event-driven operation, instrumentation plugins, and model card population at the edge.
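The kind of record such an instrumentation plugin might publish on a messaging stream can be sketched as a small JSON event builder. The field names below are assumptions for illustration, not the actual Patra event schema, and the publishing step (Kafka/ZMQ) is omitted.

```python
import json
import time
import uuid

def build_inference_event(model_id: str, device_id: str,
                          latency_ms: float, prediction: str) -> str:
    """Serialize one inference record as it might be published on a
    messaging stream for model card population (fields illustrative)."""
    event = {
        "event_id": str(uuid.uuid4()),   # unique id for deduplication
        "model_id": model_id,            # links the event to its model card
        "device_id": device_id,          # edge device that ran inference
        "timestamp": time.time(),        # spatiotemporal trace component
        "latency_ms": latency_ms,        # runtime resource metric
        "prediction": prediction,        # inference result
    }
    return json.dumps(event)
```

A consumer on the model card server side would parse these events and append them to the card's graph, which is what turns the card into a "living" object.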

This continuous integration of operational metadata shifts model cards from static forms to living knowledge objects, maintaining traceability and explainability not only at the point of model release but throughout the deployment/usage continuum.

Protocol Interface Analysis: REST vs. Model Context Protocol

A major contribution of this work is the comparative empirical evaluation of RESTful endpoints versus MCP-based interfaces for serving model card operations. MCP's persistent, session-oriented, JSON-RPC-based abstraction offers transactionality suitable for agentic AI workflows, contrasting with REST's stateless, atomic semantics.
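To make the contrast concrete: where a REST retrieval is a single stateless GET, an MCP interaction wraps each operation in a JSON-RPC 2.0 envelope inside a persistent session. The helper below builds such an envelope; the tool name and argument keys are hypothetical, not the Patra server's actual schema.

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request of the kind MCP uses for tool
    invocation. `tool` and `arguments` here are illustrative."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,          # correlates the response in a session
        "method": "tools/call",    # MCP method for invoking a server tool
        "params": {"name": tool, "arguments": arguments},
    })
```

The extra envelope, session negotiation, and second serialization layer are precisely where the MCP overheads measured below originate; REST pays only for the HTTP request itself.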

Microbenchmarks on Jetstream2 reveal non-trivial overheads for MCP (26.7 ms for model card retrieval) compared to REST (7.5 ms), with the layered MCP-over-REST variant imposing a further 16% overhead due to dual serialization and HTTP handshakes. For large model cards (~13.63 MB), database retrieval times dominate (7843 ms), and protocol differences are attenuated by the data transfer bottleneck. Wide-area deployment scenarios further amplify connection and negotiation costs. Notably, REST maintains a significant advantage for small payloads, but in large, complex, multi-institution environments MCP's additive overhead is less consequential (Figure 5).

Figure 5: Model card retrieval times for small model cards across protocols, highlighting REST’s efficiency.

Figure 6: End-to-end breakdown for REST retrieval of large model cards, showcasing connection, processing, and transfer phases.

Figure 7: Native MCP latency profile for large model cards, evidencing protocol-specific cost under high-throughput scenarios.

Figure 8: Layered MCP latency decomposition, highlighting the additive overhead over native implementations.

Figure 9: Aggregate turnaround times (log-scaled) across approaches and deployment configurations.

These results are interpreted through the lens of protocol design. Native MCP offers session-oriented interaction well-aligned with agent/assistant requirements—stateful notification, long-running tasks, and tool/resource registration—albeit with currently higher serialization and management costs.

Discussion and Agentic AI Implications

Treating model cards as active objects is positioned as foundational for agentic AI. MCP’s session semantics and server-push (SSE) capabilities naturally support agent subscription models, e.g., agents monitoring deployment anomalies or policy violations over time.
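An agent subscribed to such server-push events needs only a predicate over incoming model card mutations to decide when to act. The sketch below shows one such predicate; the event fields, threshold, and alerting criterion are all assumptions for illustration, not part of the Patra design.

```python
def should_alert(event: dict, watched_models: set,
                 max_latency_ms: float) -> bool:
    """Decide whether a pushed model card event warrants an agent alert.
    Criterion (latency threshold on watched models) is illustrative."""
    return (event.get("model_id") in watched_models
            and event.get("latency_ms", 0.0) > max_latency_ms)
```

Under MCP's session semantics this predicate runs against a stream of notifications pushed over the open session; with REST, the agent would instead have to poll the card repeatedly.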

The authors propose modeling not only model lineage and usage but also model card ownership and organizational affiliation, augmenting auditability with actor context. Model cards, under this architecture, act as boundary resources for agents, automating compliance, fairness, and explainability interventions, with automated notification and traceable event-driven mutation.

The discussion references the potential for integrating human-in-the-loop interventions to mitigate agentic cycles (e.g., infinite ReAct loops), linking dynamic model card information to actionable alerts and enforcement mechanisms for responsible AI.

Relation to Prior Art

The proposed architecture extends documentation paradigms (model cards, data cards [mitchell2019model, gebru2021datasheets]) to the cloud/edge continuum, operationalizing FAIR principles and signposting [fairsignposting, van2023fair] for persistent, machine-actionable discovery. Prior MLOps platforms (MLflow, LinkEdge [dias2023linkedge]) lack the dynamic, continuous lifecycle traceability introduced here. Literature on provenance [belhajjame2013prov, souza2022workflow] and responsible AI [memarian2023fairness, AIaccountability] further contextualizes the novelty of session-based, agentic interaction built atop dynamic model cards.

Conclusion

This work demonstrates the necessity and viability of dynamic, provenance-rich model cards that capture model usage throughout the full operational lifecycle in edge-centric AI systems. Empirical analysis of REST and MCP protocols provides actionable guidance on integration tradeoffs, revealing that while REST is performant for atomic queries, MCP better aligns with sessional, agent-driven workflow orchestration despite measurable overheads. Advancing agentic AI in heterogeneous, distributed infrastructure will require both protocol optimization (e.g., minimizing serialization cost) and further abstraction of ownership/usage semantics at the model card level.

Dynamic model cards, as implemented in the Patra framework, are positioned as essential substrates for responsible, continually auditable, and context-aware inference deployment, ultimately supporting the broader objectives of agentic AI governance, reproducibility, and compliance.
