Agent Data Protocol (ADP)

Updated 29 October 2025

Agent Data Protocol (ADP) is a standardized, lightweight data representation language that unifies diverse agent datasets using simplicity, standardization, and expressiveness.
ADP streamlines cross-domain training by normalizing actions, observations, and metadata, thereby facilitating scalable and reproducible fine-tuning across agent frameworks.
Its efficient conversion pipeline reduces engineering overhead, yielding significant accuracy gains (≈20%) and enabling robust cross-task generalization in LLM benchmarks.

The Agent Data Protocol (ADP) is a standardized, expressive, and lightweight data representation language designed to unify heterogeneous agent datasets and enable scalable, reproducible supervised fine-tuning (SFT) of LLM agents. ADP serves as a canonical "interlingua" that bridges diverse formats for agentic data—from tool use and coding to browsing, software engineering, and general agent workflows—streamlining the integration and downstream training of LLM agents using varied corpora (Song et al., 28 Oct 2025).

1. Motivation for ADP

The proliferation of agent training datasets has led to fragmentation across highly heterogeneous formats, with each dataset encoding actions, observations, and metadata according to idiosyncratic schemas and interface conventions. This lack of standardization presents acute barriers to:

Large-scale SFT, as assembling corpora from disparate sources demands significant per-dataset engineering.
Cross-architecture dataset reusability and integration among research groups.
Systematic cross-dataset analysis and quantitative benchmarking.

ADP addresses this bottleneck not by generating new data, but by reconciling and standardizing existing, diverse datasets through a shared protocol, facilitating large-scale, diverse agent training that is both accessible and reproducible.

2. Design Principles and Schema

ADP’s design is anchored by three core principles: simplicity, standardization, and expressiveness.

Simplicity: ADP adopts an intuitive, minimal schema that is broadly accessible and adaptable, minimizing the engineering burden for integration.
Standardization: It normalizes disparate agent datasets into a common representation, directly addressing heterogeneity.
Expressiveness: Despite simplicity, ADP is sufficiently rich to encode tool/API use, programming, browsing, software engineering, and general agentic workflows.

The central abstraction is the Trajectory:

$\textbf{Trajectory} = \left( \texttt{id},\ \texttt{content},\ \texttt{details} \right)$

id: Unique trajectory identifier.
content: Alternating sequence of actions and observations.
details: Key-value metadata for dataset-specific extensions.

Actions

Three canonical action types are defined:

APIAction (tool/API usage): {function: str, kwargs: dict, description: str(optional)}
CodeAction (code generation/execution): {language: str, content: str, description: str(optional)}
MessageAction (natural language): {content: str}

Observations

Two principal observation types ensure coverage across environments:

TextObservation: {source: ("user"|"environment"), content: str}
WebObservation: {html: str, axtree: optional, url: str, viewport_size: tuple, image_observation: optional}

The ADP schema is implemented as Pydantic schemas, enabling strong typing, validation, and automated integrity checks.

3. ADP as Interlingua: Decoupling Data and Agent Architectures

ADP operationalizes a hub-and-spoke conversion paradigm. Each dataset is mapped once to ADP, and each agent harness is mapped once from ADP to its training format. Thus, for $D$ datasets and $A$ agent frameworks, the total conversion effort scales linearly ( $O(D+A)$ ) rather than quadratically ( $O(D \times A)$ ).

This architecture confers several advantages:

Plug-and-play extensibility for future datasets and agent frameworks.
Linear engineering scaling as the ecosystem grows.
Shared community leverage of existing conversion infrastructure.

Empirically, unifying 13 agent datasets required ≈4,900 LOC for dataset conversion, and only ≈77 LOC per agent-harness adapter, marking a considerable reduction over previous approaches.

4. Expressiveness and Coverage

ADP’s schema has been demonstrated to faithfully represent a wide range of agent interaction trajectories, including:

Repository-level software engineering tasks,
Web navigation and browser/GUI operations,
General tool use and complex reasoning sequences,
Datasets originating from both synthetic and human sources.

The ADP corpus exceeds 1.3 million trajectories, with wide diversity in conversational rounds, complexity, and covered domains.

5. Conversion Pipeline and Quality Assurance

ADP underpins a three-stage pipeline:

Raw → ADP: Dataset-specific actions, code, and observations are normalized into ADP types.
ADP → SFT-ready: ADP data is mapped to format-specific training inputs for agent frameworks (e.g., OpenHands, AgentLab, SWE-Agent) via focused adapters.
Quality Assurance: Automated checks enforce internal consistency, action/observation integrity, validated tool use, paired reasoning/thought, and correct conversational formatting.

This workflow guarantees high-quality, dataset-agnostic training inputs, facilitating robust model development and reproducibility.

6. Experimental Evaluation and Performance Impact

Extensive supervised fine-tuning experiments on cross-domain ADP-standardized corpora yielded:

Average absolute accuracy gain of ≈20% over corresponding base models.
State-of-the-art or near SOTA results with 7B, 14B, and 32B LLMs in benchmarks:
- SWE-Bench (software engineering): Qwen-2.5-7B-Coder-Instruct improved from 0.4% to 20.2% (+19.8%).
- WebArena (web navigation): Qwen-2.5-7B-Instruct improved from 8.3% to 21.0%.
- AgentBench, GAIA, spanning multi-domain agentic tasks.
Monotonic accuracy improvements with increased model size.
Consistent cross-task generalization: agents fine-tuned on diverse ADP data transfer positively across domains versus domain-specific SFT agents.

7. Community Release and Implications

All code, schemas, dataset adapters, agent-harness converters, and the entire 1.3M+ trajectory ADP corpus are publicly released (https://agentdataprotocol.com), promoting open reproducibility and collaborative extension. The protocol lowers barriers to contributing new datasets, with all agent frameworks benefitting immediately from unified consistency.

ADP’s adoption establishes a lingua franca for agent data in LLM agent research, enabling systematic analysis, robust cross-domain benchmarking, efficient data integration, and scalable, maintainable SFT practices. Long-term implications include accelerated scientific progress, transparent model comparison, and more diverse, capable agent ecosystems.

Markdown Report Issue Upgrade to Chat

References (1)

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents (2025)

Topic to Video (Beta)

No one has generated a video about this topic yet.

Whiteboard

No one has generated a whiteboard explanation for this topic yet.

Follow Topic

Get notified by email when new papers are published related to Agent Data Protocol (ADP).

Agent Data Protocol (ADP)

1. Motivation for ADP

2. Design Principles and Schema

Actions

Observations

3. ADP as Interlingua: Decoupling Data and Agent Architectures

4. Expressiveness and Coverage

5. Conversion Pipeline and Quality Assurance

6. Experimental Evaluation and Performance Impact

7. Community Release and Implications

Topic to Video (Beta)

Whiteboard

Follow Topic

Continue Learning

Don't miss out on important new AI/ML research

Agent Data Protocol (ADP)

1. Motivation for ADP

2. Design Principles and Schema

Actions

Observations

3. ADP as Interlingua: Decoupling Data and Agent Architectures

4. Expressiveness and Coverage

5. Conversion Pipeline and Quality Assurance

6. Experimental Evaluation and Performance Impact

7. Community Release and Implications

Topic to Video (Beta)

Whiteboard

Follow Topic

Continue Learning

Related Topics

Don't miss out on important new AI/ML research

Sign up for free to explore the frontiers of research