A2A Protocol for Autonomous AI Agents

Updated 27 May 2026

A2A Protocol is an open, standardized framework for secure and scalable peer-to-peer collaboration among diverse AI agents, utilizing JSON-RPC over HTTP(S).
It employs Agent Cards for discovery, authentication, and semantic negotiation, enabling dynamic skill matching and secure task delegation.
Key features include support for synchronous and asynchronous communication, modality agnosticism, and robust security through token-based and signature authentication.

The Agent-to-Agent (A2A) Protocol is an open, standardized framework that enables peer-to-peer discovery, secure task delegation, and result streaming among autonomous AI agents—primarily within large multi-agent systems and increasingly in enterprise, edge, and multimodal deployments. Originally developed by Google and now under Linux Foundation stewardship, A2A addresses a core interoperability challenge: enabling scalable, robust, and secure collaboration between heterogeneous autonomous agents with diverse skills, implementation languages, and deployment contexts (Ehtesham et al., 4 May 2025, Duan et al., 17 Aug 2025). It operates above the transport layer using JSON-RPC 2.0 over HTTP(S), with extensions for Server-Sent Events (SSE) and webhooks to support asynchronous and real-time updates across task lifecycles.

1. Architecture and Communication Model

A2A defines a rigorous yet flexible architecture centered on three principal roles: User, Client Agent, and Remote Agent. The User (a human, service, or agent) originates the request but does not itself communicate via A2A. The Client Agent discovers Remote Agents (via Agent Cards served at the standardized endpoint /.well-known/agent.json), selects partners based on skill metadata, and constructs and sends task requests. Remote Agents host JSON-RPC endpoints, publish signed Agent Cards, execute declared skills, and deliver results through either synchronous JSON-RPC responses or SSE streams for partial and incremental artifacts (Ehtesham et al., 4 May 2025, Jeong, 2 Jun 2025).

The formal message model is based on JSON-RPC 2.0:

{
  "jsonrpc": "2.0",
  "id": "<task-id>",
  "method": "<skill-name>",
  "params": { ... }
}

Responses utilize either the "result" or "error" fields. For streaming scenarios, the SSE stream carries JSON-RPC fragments keyed by task ID, supporting fine-grained progress reporting and artifact delivery. Task state machines traverse canonical states—SUBMITTED, WORKING, INPUT_REQUIRED, COMPLETED, FAILED, CANCELED—with transitions orchestrated through method calls and event streaming (Habler et al., 23 Apr 2025, Duan et al., 17 Aug 2025).
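The request envelope and task lifecycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names (build_request, parse_response, advance) are hypothetical, and the transition table is an assumption, since the text names the canonical states but not the permitted edges between them.

```python
import json
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "SUBMITTED"
    WORKING = "WORKING"
    INPUT_REQUIRED = "INPUT_REQUIRED"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


# Assumed transition table (illustrative only): terminal states have no exits.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
    TaskState.WORKING: {TaskState.INPUT_REQUIRED, TaskState.COMPLETED,
                        TaskState.FAILED, TaskState.CANCELED},
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.FAILED,
                               TaskState.CANCELED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELED: set(),
}


def build_request(task_id, method, params):
    """Serialize a JSON-RPC 2.0 request with the envelope shown above."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": task_id, "method": method, "params": params}
    )


def parse_response(raw):
    """Return (id, result, error); a response carries exactly one of the two."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if ("result" in msg) == ("error" in msg):
        raise ValueError("response must carry exactly one of result/error")
    return msg["id"], msg.get("result"), msg.get("error")


def advance(current, new):
    """Move a task to a new state, rejecting edges outside the assumed table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new
```

A client would typically call advance each time an SSE event or response reports a new state, so that an out-of-order or forged update is rejected rather than silently applied.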

2. Discovery, Handshake, and Agent Cards

Agent discovery, authentication, and semantic negotiation are centered on the Agent Card, an extensible JSON document describing an agent’s identity (optionally a DID), version, capabilities (skills with input/output schema and MIME types), supported authentication mechanisms, metadata, and sample invocations (Ehtesham et al., 4 May 2025, Li et al., 6 May 2025). Clients retrieve Agent Cards via HTTP GET and, where supported, verify their JWS signatures, which guards against spoofed and typosquatted cards.

A minimal handshake comprises:

Discovery: HTTP GET https://agent.example.com/.well-known/agent.json yields the Agent Card (including public keys for verification).
Verification: the client checks the signature on the Agent Card.
Task submission: JSON-RPC tasks/send (or tasks/sendSubscribe for streaming), including an ephemeral capability token if required by the Agent Card (Ehtesham et al., 4 May 2025, Habler et al., 23 Apr 2025).
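The three handshake steps can be sketched as follows. This is a rough outline under stated assumptions: the function names are hypothetical, signature verification is delegated to a caller-supplied verify_card callable (a real deployment would use a JWS library), the params shape {"message": ...} is illustrative, and the capability token is assumed to travel as a standard OAuth 2.0 bearer header. The fetch parameter is injectable so the flow can be exercised without network access.

```python
import json
import uuid
from urllib.parse import urljoin
from urllib.request import urlopen

WELL_KNOWN_PATH = "/.well-known/agent.json"


def agent_card_url(base_url):
    """Resolve the Agent Card location for an agent's base URL."""
    return urljoin(base_url, WELL_KNOWN_PATH)


def http_get_json(url):
    """Default fetcher: HTTP(S) GET returning the parsed JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def handshake(base_url, message, verify_card, fetch=http_get_json,
              token=None, streaming=False):
    """Return (card, request_body, headers) ready to POST to the agent."""
    # Step 1: discovery.
    card = fetch(agent_card_url(base_url))

    # Step 2: signature verification, delegated to the caller (assumption).
    if not verify_card(card):
        raise ValueError("Agent Card failed signature verification")

    # Step 3: task submission envelope; params shape is illustrative.
    body = {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/sendSubscribe" if streaming else "tasks/send",
        "params": {"message": message},
    }
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return card, body, headers
```

Keeping verification as a mandatory argument means a caller has to make an explicit choice to skip it, rather than skipping it by omission.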

Agent Card metadata enables dynamic skill cataloging, modality/format negotiation, endpoint resolution, and specification of required authentication mechanisms.

3. Interaction Patterns, Modalities, and Message Parts

A2A natively supports synchronous request–response, server-push streaming (via SSE), and webhook notification as fundamental interaction patterns. Each Task encompasses semantically rich Message objects—composed of multimodal Parts (TextPart, FilePart, DataPart), each with an explicit MIME type (Srinivasan, 14 Apr 2026, Li et al., 6 May 2025). By design, A2A is modality-agnostic, allowing structured delivery of text, audio, image, or structured data elements without forced proxy via text—a capability exploited in Modality-Native Routing (MMA2A).
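One way to picture the Message and Part structure is as a small set of Python dataclasses. The part names come from the text; the field names (text, data, mime_type, role) are assumptions chosen for illustration and may not match the wire format.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextPart:
    text: str
    mime_type: str = "text/plain"


@dataclass
class FilePart:
    data: bytes
    mime_type: str  # e.g. "image/png" or "audio/wav"; no default on purpose


@dataclass
class DataPart:
    data: dict
    mime_type: str = "application/json"


Part = Union[TextPart, FilePart, DataPart]


@dataclass
class Message:
    role: str  # hypothetical field: who produced the message
    parts: List[Part] = field(default_factory=list)

    def mime_types(self):
        """List the MIME type of every part, in order."""
        return [p.mime_type for p in self.parts]


msg = Message(
    role="user",
    parts=[
        TextPart("What is wrong with this circuit board?"),
        FilePart(b"\x89PNG...", "image/png"),
        DataPart({"board_id": 17}),
    ],
)
```

Because each part carries its own MIME type, a receiver can dispatch on mime_types() without inspecting payloads, which is what makes the protocol modality-agnostic.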

MMA2A demonstrates that preserving native modality raises task completion accuracy by up to 38.5 percentage points in vision-dependent workflows, establishing a two-layer requirement: (1) protocol-level native routing according to the agent input/output modes declared in Agent Cards, and (2) downstream agents capable of high-fidelity cross-modal reasoning. Accuracy improvements of 20 percentage points over text-bottlenecked pipelines were observed empirically, but only when the downstream agents were capable multimodal LLM reasoners. This expressiveness comes at a 1.8× latency cost for multimodal tasks (Srinivasan, 14 Apr 2026).
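The protocol-level half of this requirement, routing each part to an agent that declares its modality, can be sketched as below. This is not the MMA2A implementation: the agent records, the input_modes key, and the wildcard matching via fnmatch are assumptions made for illustration. The second return value flags whether the part was routed natively or must first be proxied through text.

```python
from fnmatch import fnmatch


def accepts(agent, mime):
    """True if any declared input mode (wildcards allowed) matches the MIME type."""
    return any(fnmatch(mime, pattern) for pattern in agent["input_modes"])


def route(part_mime, agents, fallback_mime="text/plain"):
    """Return (agent_name, native) for a part of the given MIME type.

    native=True  -> the agent accepts the part as-is.
    native=False -> no native match; the part would need a text proxy
                    (the "text bottleneck" described above).
    """
    for agent in agents:
        if accepts(agent, part_mime):
            return agent["name"], True
    for agent in agents:
        if accepts(agent, fallback_mime):
            return agent["name"], False
    raise LookupError(f"no agent accepts {part_mime} or {fallback_mime}")


# Hypothetical agents with modes as they might be declared in Agent Cards.
AGENTS = [
    {"name": "vision-agent", "input_modes": ["image/*"]},
    {"name": "text-agent", "input_modes": ["text/plain"]},
]
```

With these two agents, an image/png part goes to vision-agent natively, while an audio/wav part falls back to text-agent with native set to False, which is the case where accuracy would be expected to suffer.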

4. Security, Trust Models, and Risk Surface

A2A’s security posture combines standard transport-level protections (HTTPS, TLS 1.2+) with token-based authentication (OAuth 2.0 bearer tokens, JWT) and session state management. Agent Cards declare supported authentication mechanisms, with OAuth 2.0 scopes enforced per skill (Louck et al., 18 May 2025, Louck et al., 5 Nov 2025, Anbiaee et al., 11 Feb 2026). However, multiple studies highlight security gaps:

Tokens in A2A have historically lacked lifetime control; multi-hour validity defaults enable replay attacks.
Scopes are often coarse-grained, violating least privilege—broad scopes such as "calendar.read" or "payments" allow privilege escalation.
Lack of strong customer authentication (MFA or ZKP) in the core protocol means that simple possession of a token equates to consent (Louck et al., 18 May 2025, Louck et al., 5 Nov 2025).
Agent Cards, unless explicitly signed, can be spoofed, creating vulnerabilities in peer discovery and identity assertion (Hu et al., 5 Nov 2025).
SSE streams, if unchecked, risk unfiltered data leakage and DoS attacks via flooding.

The protocol’s exposure score is high: 12 of the 14 vulnerabilities in a 14-point security taxonomy are present in A2A’s default deployments, with confirmed exposure in replay, impersonation, inadequate consent/audit, and message injection (Louck et al., 5 Nov 2025, Section 3). Mitigations include enforcing short-lived, single-use JWTs, granular per-task scopes, mandatory Agent Card signing, and per-message cryptographic signatures on RPC/SSE events. Best practices recommend adding CORAL- and ACP-inspired session binding, message-level integrity protection (e.g., mandatory JWS), and immutable audit logging to close the remaining structural gaps (Louck et al., 5 Nov 2025, Louck et al., 18 May 2025).
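The first two mitigations (short-lived, single-use tokens with per-task scopes) amount to a few checks on the token's claims. The sketch below assumes the JWT signature has already been verified elsewhere and operates only on the decoded claims; exp, iat, and jti are the registered JWT claim names, while the ReplayGuard class, the 300-second cap, and the scope string format are illustrative assumptions.

```python
import time


class ReplayGuard:
    """Reject expired, over-long, replayed, or under-scoped tokens."""

    def __init__(self, max_lifetime_s=300):
        self.max_lifetime_s = max_lifetime_s  # assumed policy, not from the spec
        self._seen = {}  # jti -> exp, kept only until the token expires

    def check(self, claims, required_scope, now=None):
        now = time.time() if now is None else now
        # Forget identifiers whose tokens can no longer be presented anyway.
        self._seen = {j: e for j, e in self._seen.items() if e > now}

        exp, iat, jti = claims["exp"], claims["iat"], claims["jti"]
        if exp <= now:
            raise PermissionError("token expired")
        if exp - iat > self.max_lifetime_s:
            raise PermissionError("token lifetime exceeds policy")
        if jti in self._seen:
            raise PermissionError("token already used (replay)")
        if required_scope not in claims.get("scope", "").split():
            raise PermissionError("token lacks required scope")

        # Record the identifier only after every check has passed.
        self._seen[jti] = exp
```

A production version would need shared storage for the seen set when several instances serve the same agent; the in-memory dictionary here only covers a single process.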

Trust in A2A can be layered by combining "Brief" (TLS certificate claims), "Claim" (Agent Card self-declaration), cryptographic "Proof" (attestations or ZKPs), "Stake" (slashing-backed deposits), and "Reputation" overlays. The latter two are not native to core A2A but can be externally layered for open, adversarial settings (Hu et al., 5 Nov 2025).

5. Semantic Negotiation, Orchestration, and Protocol Interoperation

A2A’s fundamental design is to serve as the "horizontal" orchestration backbone for distributed reasoning across agents, as opposed to "vertical" protocols (e.g., MCP) mediating access to tools or external data sources. Orchestration occurs via:

Skill matching and negotiation driven by free-text and schema fields in Agent Cards; MIME-level negotiation of parts is supported, but there is no built-in ontological mapping or intent normalization. This makes semantic misalignment a recognized challenge (Li et al., 6 May 2025).
Task-centric lifecycle management, which may be audited and externally verified in robust deployments. Higher-level task coordination and consensus, as implemented in semi-centralized frameworks (e.g., Anemoi), leverages A2A for plan negotiation, progress critique, and consensus voting (Ren et al., 23 Aug 2025).
Cross-protocol integration: In ecosystem deployments, A2A is responsible for agent discovery, authentication, and delegation, while MCP orchestrates lower-level, type-safe tool integration. Combined deployments require careful attention to composite security, semantic interoperability, and unified observability. Practitioners are advised to correlate traces across both layers (e.g., with OpenTelemetry) and to align skill descriptions with MCP tool schemas to support seamless handoff (Jeong, 2 Jun 2025, Li et al., 6 May 2025).
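The skill-matching limitation noted in this list can be made concrete with a deliberately naive matcher. The sketch below filters skills by declared input MIME type and ranks them by keyword overlap with the free-text description. The card layout (skills, id, name, description, inputModes) and the scoring rule are assumptions for illustration only.

```python
def match_skills(card, query, input_mime):
    """Rank a card's skills by keyword overlap, after a MIME-type filter."""
    terms = set(query.lower().split())
    hits = []
    for skill in card["skills"]:
        if input_mime not in skill["inputModes"]:
            continue  # MIME-level negotiation: the skill cannot take this input
        words = set((skill["name"] + " " + skill["description"]).lower().split())
        score = len(terms & words)
        if score:
            hits.append((score, skill["id"]))
    # Highest score first; ties broken by id for a stable order.
    return [sid for score, sid in sorted(hits, key=lambda h: (-h[0], h[1]))]


# Hypothetical Agent Card fragment.
CARD = {
    "skills": [
        {"id": "sum", "name": "summarize",
         "description": "summarize a long text report",
         "inputModes": ["text/plain"]},
        {"id": "ocr", "name": "extract text",
         "description": "extract text from a scanned image",
         "inputModes": ["image/png"]},
    ]
}
```

A query of "summarize report" finds the sum skill, but the synonymous "condense document" finds nothing, because there is no ontology or intent normalization to bridge the vocabulary gap. That is the semantic misalignment the text refers to.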

Best practices for robust orchestration include layering explicit lifecycle gating (FSMs), formal pre/post-conditions ("contracts") in Agent Cards, typed message schemas, and audit-enforced transitions, as exemplified by the SEMAP protocol (Mao et al., 14 Oct 2025).

6. Operational Limits, Edge/Enterprise Deployment, and Future Directions

A2A is widely deployed in enterprise and edge environments, benefiting from HTTP-native transport and modality agnosticism but challenged by issues of scalability, resource awareness, and transport overhead (Duan et al., 17 Aug 2025). Notably:

Resource constraints of edge environments (bandwidth, CPU) are not natively accommodated—no built-in compression or lightweight binary encoding exists.
Scalability is limited by centralized registries and point-to-point session models, which can become performance bottlenecks in large-scale deployments.
Discovery and identity assertions remain friction points absent signed Agent Cards or decentralized registries.

Proposed enhancements and research directions include:

Extension of Agent Cards with resource metadata to enable capacity-aware discovery and match-making.
Transition to decentralized, peer-to-peer discovery mechanisms (DHTs, gossip overlays), and adoption of lightweight serialization (CBOR, Protobuf) for constrained environments.
Enrichment with DLT-anchored Agent Cards and blockchain-agnostic micropayments (x402 extension), enabling verifiable identities and compensation across open agent economies (Vaziry et al., 24 Jul 2025).
Integration with expressive delegation and provenance protocols (AIP) for chained, cryptographically verifiable authorization and audit trails (Prakash, 25 Mar 2026).
Adoption of default multimodal routing to maximize cross-modal task fidelity (Srinivasan, 14 Apr 2026).
Formalization of semantic negotiation protocols, robust contract schemas, and governance-as-code across protocol boundaries (Li et al., 6 May 2025, Mao et al., 14 Oct 2025).
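As an illustration of the first proposal, capacity-aware discovery could look roughly like the sketch below. Everything here is hypothetical: current Agent Cards are not stated to carry a resources field, and the keys free_cpu and latency_ms, along with the thresholds, are invented for the example.

```python
def pick_agent(cards, skill_id, min_free_cpu=0.2, max_latency_ms=500):
    """Return the name of the least-loaded agent offering skill_id, or None."""
    candidates = []
    for card in cards:
        if skill_id not in {s["id"] for s in card.get("skills", [])}:
            continue
        res = card.get("resources")  # hypothetical Agent Card extension
        if res is None:
            continue  # no resource metadata: capacity cannot be judged
        if res["free_cpu"] >= min_free_cpu and res["latency_ms"] <= max_latency_ms:
            candidates.append(card)
    if not candidates:
        return None
    # Prefer the agent reporting the most spare CPU.
    return max(candidates, key=lambda c: c["resources"]["free_cpu"])["name"]


# Hypothetical cards for three edge agents offering the same skill.
CARDS = [
    {"name": "edge-a", "skills": [{"id": "ocr"}],
     "resources": {"free_cpu": 0.1, "latency_ms": 40}},
    {"name": "edge-b", "skills": [{"id": "ocr"}],
     "resources": {"free_cpu": 0.6, "latency_ms": 120}},
    {"name": "edge-c", "skills": [{"id": "ocr"}]},
]
```

Here edge-a is excluded for insufficient spare CPU and edge-c for publishing no resource metadata, leaving edge-b as the selection.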

7. Comparative Position and Proven Deployments

Compared with ACP, MCP, and ANP, A2A occupies the "middle of the pack" for both interoperability and security: stronger than MCP (which lacks authentication pre-v1.2) and flexible but less natively robust than ANP's DID-based decentralized architecture (Ehtesham et al., 4 May 2025, Anbiaee et al., 11 Feb 2026). Notable deployments include multi-modal customer support systems with MMA2A (Srinivasan, 14 Apr 2026), scalable multi-agent planners with Anemoi (Ren et al., 23 Aug 2025), STEM Agent's unified protocol handler (Shen et al., 22 Mar 2026), and integrated orchestrator-tool architectures exemplified in AgentMaster (Liao et al., 8 Jul 2025).

A2A’s protocol design, when combined with robust registration, identification, and context-binding, forms a practical backbone for emergent agentic ecosystems. Its evolution now focuses on resource awareness, hybrid trust architectures, modular extension for payments and attestation, and end-to-end security hardening to support the next generation of scalable, autonomous, and trustworthy AI agent networks.