Open Agentic Schema Framework

Updated 30 September 2025

Open Agentic Schema Framework (OASF) is a formal, extensible, and agent-driven approach for managing schemas in heterogeneous multi-agent AI systems.
It employs multi-agent simulations with roles like Analyst, Critic, and Verifier to iteratively refine and validate semantically coherent database views.
Its content-addressable, decoupled architecture ensures secure, scalable, and forward-compatible evolution of agent capabilities and metadata.

The Open Agentic Schema Framework (OASF) is a formal, extensible, and agent-driven approach for the definition, refinement, indexing, and discovery of schemas in heterogeneous, multi-agent artificial intelligence systems. OASF provides a structured method for representing agent capabilities, metadata, and semantic relationships—serving as both an interoperability substrate and a practical toolkit for emerging use cases such as schema refinement, multimodal knowledge extraction, federated agent directories, and agentic orchestration across distributed environments. By embedding agentic principles and modularity at its core, OASF enables scalable, autonomous, and versioned evolution of semantic layers within complex AI ecosystems.

1. Core Principles and Formal Structure

OASF is characterized by a focus on agentic and semantic abstraction rather than static, human-curated schemas. Its foundational tenets include:

Semantic Layering: OASF mediates between raw, often unwieldy enterprise data schemas and downstream users or applications by expressing distilled, easy-to-interpret database views or agent metadata. These views or records function as a refined, semantically coherent schema that captures key entities, relationships, and capabilities.
Multi-Agent Simulation for Schema Refinement: The schema refinement process is cast as a multi-agent LLM simulation. Specialized agents—such as Analyst, Critic, and Verifier—engage in iterative rounds of proposal, critique, and verification to invent, review, and validate candidate schema components. The canonical abstraction for a refined query is:

$Q \equiv \mathcal{R}(V_1, V_2, \ldots, V_k)$

where each $V_i$ is a view and $\mathcal{R}$ is a relational operator (e.g., join, union).

Composable Agent Records: At the directory and registry level, OASF defines agent records as versioned, extensible artifacts including skills, domains, metadata, evaluation results, and hierarchical taxonomies. These records are resistant to schema drift and backward compatible.
Content-Addressable, Decoupled Architecture: Semantic schemas (capability indices) and physical content locations (artifacts, endpoints) are decoupled. Identification and retrieval employ content-addressed storage using cryptographic digests (e.g., SHA-256), facilitating tamper evidence and deduplication.
Schema-Driven Extensibility: Extension (e.g., new modalities, evaluation metrics, prompt bundles) is enabled through a forward-compatible design that does not invalidate prior records.
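A minimal sketch of content addressing and non-breaking extension follows. It is illustrative only: the record field names, the `sha256:` prefix, and the canonical JSON serialization are assumptions made for this example, not the OASF wire format.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize deterministically so equal records always hash identically."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def content_id(record: dict) -> str:
    """Return a SHA-256 based identifier derived only from the record content."""
    return "sha256:" + hashlib.sha256(canonical_bytes(record)).hexdigest()


def read_skills(record: dict) -> list:
    """An 'older' reader: it knows only the skills field and ignores the rest."""
    return list(record.get("skills", []))


# Hypothetical agent record; every field name here is an assumption.
record_v1 = {
    "name": "summarizer",
    "version": "1.0.0",
    "skills": ["nlp.summarization.abstractive"],
}

# A later version adds an optional field without altering existing ones.
# The evaluation values are made up for illustration.
record_v2 = {**record_v1, "version": "1.1.0", "evaluation": {"metric": "example", "score": 0.5}}

cid_v1 = content_id(record_v1)
cid_v2 = content_id(record_v2)

print(cid_v1 != cid_v2)        # any change to the content yields a new identifier
print(read_skills(record_v2))  # the older reader still works on the newer record
```

Because the identifier is a function of the bytes alone, two registries holding the same record derive the same identifier (deduplication), and any modification is detectable by recomputing the digest (tamper evidence).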

2. Multi-Agent Simulations and Schema Refinement

OASF’s distinctive approach to schema discovery and refinement is realized through collaborative LLM-driven simulations:

Roles: Analyst agents formulate analytic queries and propose intermediate database views; Critic agents review structure, logic, and naming, advocating for decompositions and normalization; Verifier agents execute SQL definitions, confirming materializability and correctness.
Iterative Process:

Start with a minimally seeded subset of the schema (sampled component, basic description).
Analyst proposes candidate view(s): $V = \pi_{\text{attributes}} (\sigma_{\text{condition}}(T))$ , with $\pi$ as projection and $\sigma$ as selection.
Critic injects modifications $\delta(V)$ to refine the logic or isolate semantics.
Verifier executes and validates each $V$ using real or simulated backend engines.
Validated views are persisted; failed views are revised, ensuring a feedback loop.
Session memory maintains the history to avoid duplication and support incremental coverage.

Empirical Outcomes: In production case studies, this process reduced large commercial schemas (e.g., 61 tables with median width 28 columns) to normalized semantic layers (e.g., 1146 views with median width 3), preserving most relationships and introducing new cross-table associations.
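The propose, critique, and verify loop described above can be sketched with SQLite standing in for the backend engine. This is a toy under stated assumptions: the `analyst` and `critic` functions are deterministic stand-ins for LLM agents, the `orders` table and view names are hypothetical, and failed views are only recorded for later revision rather than revised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "acme", "open", 10.0), (2, "acme", "closed", 25.0), (3, "zen", "open", 7.5)],
)


def analyst(round_no):
    """Stand-in for an LLM Analyst: returns (view name, SELECT body)."""
    proposals = [
        # A projection over a selection, as in V = pi(sigma(T)).
        ("open_orders", "SELECT id, customer, total FROM orders WHERE status = 'open'"),
        ("bad_view", "SELECT missing_column FROM orders"),  # deliberately invalid
    ]
    return proposals[round_no]


def critic(name, sql):
    """Stand-in for an LLM Critic: here it only enforces a naming convention."""
    return (name if name.startswith("v_") else "v_" + name), sql


def verifier(conn, name, sql):
    """Create the view and read from it to confirm that it materializes."""
    try:
        conn.execute(f"CREATE VIEW {name} AS {sql}")
        conn.execute(f"SELECT * FROM {name} LIMIT 1").fetchall()
        return True
    except sqlite3.Error:
        conn.execute(f"DROP VIEW IF EXISTS {name}")
        return False


validated, failed, seen = {}, [], set()  # 'seen' plays the role of session memory
for round_no in range(2):
    name, sql = critic(*analyst(round_no))
    if sql in seen:  # skip duplicate proposals
        continue
    seen.add(sql)
    if verifier(conn, name, sql):
        validated[name] = sql  # persist validated views
    else:
        failed.append(name)    # a fuller loop would send these back for revision

rows = conn.execute("SELECT * FROM v_open_orders ORDER BY id").fetchall()
print(sorted(validated), failed, rows)
```

The verifier queries the view after creating it because SQLite accepts a view definition that references a missing column and only reports the error when the view is read.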

3. Hierarchical Taxonomies, Extensible Metadata, and Directory Services

OASF formalizes agent, view, and capability descriptions through hierarchical, versioned taxonomies and extensible metadata fields:

Taxonomy-Driven Indexing: Skills and agent capabilities are catalogued using dotted notation (e.g., nlp.summarization.abstractive). Taxonomies define bounded index fan-out and admit efficient multi-dimensional queries.
Directory Interoperability: Platforms such as the Agent Directory Service (ADS) adopt OASF for the schema layer—enabling multi-level mapping from skills to content identifiers (CIDs), and from CIDs to storage endpoints. ADS supports content-addressed, deduplicated storage over Open Container Initiative (OCI) registries, uses Kademlia DHTs for decentralized index distribution, and supports cryptographically signed, provenance-tracked agent records.
Modality and Feature Extensions: New agent types (LLM prompt bundles, MCP servers, agent-to-agent protocol descriptors) are integrated by extending the schema with non-breaking fields.
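The two-level mapping (skills to CIDs, then CIDs to storage endpoints) can be illustrated with an in-memory index. This is a sketch, not the ADS implementation: the `SkillIndex` class, its method names, the CIDs, and the endpoint strings are all hypothetical. Registering a record under every prefix of its dotted skill name is one simple way to support queries at any taxonomy level.

```python
from collections import defaultdict


def prefixes(skill: str):
    """Yield every taxonomy prefix of a dotted skill name, shortest first."""
    parts = skill.split(".")
    for i in range(1, len(parts) + 1):
        yield ".".join(parts[:i])


class SkillIndex:
    """Hypothetical two-level index: taxonomy node -> CIDs -> endpoints."""

    def __init__(self):
        self.skill_to_cids = defaultdict(set)
        self.cid_to_endpoints = defaultdict(set)

    def register(self, cid, skills, endpoint):
        for skill in skills:
            for node in prefixes(skill):
                self.skill_to_cids[node].add(cid)
        self.cid_to_endpoints[cid].add(endpoint)

    def find(self, *nodes):
        """Return the CIDs that match all given taxonomy nodes."""
        sets = [self.skill_to_cids.get(node, set()) for node in nodes]
        return set.intersection(*sets) if sets else set()


index = SkillIndex()
index.register("sha256:aaa", ["nlp.summarization.abstractive"], "registry.example/a")
index.register("sha256:bbb", ["nlp.summarization.extractive", "vision.ocr"], "registry.example/b")

print(sorted(index.find("nlp.summarization")))  # both records
print(sorted(index.find("nlp", "vision")))      # only the record with both skills
```

A multi-dimensional query is then a set intersection over the per-node entries, and the number of index entries written per skill is bounded by the depth of the taxonomy.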

4. Security, Provenance, and Performance Considerations

OASF’s architecture incorporates several mechanisms to ensure security, verifiability, and operational efficiency:

Content Integrity and Trust: All agent records and views are associated with immutable digests (SHA-256). Optional cryptographic signing, such as with Sigstore, establishes provenance and supports keyless signatures validated in transparency logs.
Decentralized Discovery: Use of Kademlia-based DHTs enables efficient, federated lookup of agent records and capability indices without centralized trust anchors or global consensus, facilitating robustness and discovery performance.
Query Semantics and Caching: Indices are built as posting lists mapping skills/dimensions to CIDs. Performance is improved by separating the small, compressible index layer (frequently cached, proactively replicated) from the bulk artifact/content layer (retrieved on demand). Proactive cache scoring uses:

$S = \alpha f + \beta r + \gamma c$

where $f$ is frequency, $r$ recency, and $c$ compression ratio.
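As a worked example of the scoring formula, the sketch below ranks index entries for proactive replication. The weights, the normalization of each signal to the range 0 to 1, and the names `IndexEntry` and `select_for_replication` are assumptions for illustration; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    key: str
    frequency: float    # f, assumed normalized to [0, 1]
    recency: float      # r, assumed normalized to [0, 1]
    compression: float  # c, assumed normalized to [0, 1]


def cache_score(entry, alpha=0.5, beta=0.3, gamma=0.2):
    """S = alpha*f + beta*r + gamma*c, with illustrative default weights."""
    return alpha * entry.frequency + beta * entry.recency + gamma * entry.compression


def select_for_replication(entries, budget):
    """Return the keys of the `budget` highest-scoring entries."""
    ranked = sorted(entries, key=cache_score, reverse=True)
    return [entry.key for entry in ranked[:budget]]


entries = [
    IndexEntry("nlp.summarization", frequency=0.9, recency=0.2, compression=0.5),
    IndexEntry("vision.ocr", frequency=0.1, recency=0.9, compression=0.5),
    IndexEntry("audio.asr", frequency=0.5, recency=0.5, compression=1.0),
]

for entry in entries:
    print(entry.key, round(cache_score(entry), 2))
print(select_for_replication(entries, budget=2))
```

With these weights the scores are 0.61, 0.42, and 0.60, so a budget of two replicates the frequently used entry and the highly compressible one. Changing the weights shifts the balance between popularity, freshness, and storage cost.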

5. Applications and Case Studies

OASF enables a range of scenarios in contemporary AI and multi-agent systems:

Enterprise Schema Refinement: OASF’s multi-agent simulation layer yields a normalized, semantically meaningful set of database views, facilitating improved downstream analytics and text-to-SQL conversion. This demonstrates practical gains in manageability and interpretability of enterprise data (Rissaki et al., 25 Nov 2024).
Multimodal Annotation and Retrieval: Frameworks such as RAVEN extend OASF principles to multimodal contexts, dynamically generating schemas for video collections and facilitating structured entity extraction within large-scale, unstructured domains (Rosa, 3 Mar 2025).
Agent Capabilities Indexing and Discovery: ADS leverages OASF for agent registration, enabling multi-dimensional, verifiable discovery mechanisms applicable to domains such as clinical summarization, federated enterprise agents, and cross-jurisdictional artifact replication (Muscariello et al., 23 Sep 2025).
Hybrid Retrieval and Literature Review: An OASF-based hybrid RAG framework applies agentic decision logic to select among retrieval strategies (GraphRAG/VectorRAG), quantifies uncertainty, and adapts instruction tuning per researcher needs, supporting scalable, agent-driven scientific discovery and review workflows (Nagori et al., 30 Jul 2025).

6. Extensibility, Interoperability, and Research Directions

OASF’s schema-driven philosophy and decoupled architecture enable:

Forward-Compatible Evolution: Versioned schemas admit incremental extension for new agent modalities, features, and evaluation metrics without disrupting existing deployments.
System Interoperability: Reuse of standards such as OCI/ORAS for artifact distribution, integration with external naming and registry schemes, and compatibility with labeling and provenance solutions are supported.
Open Challenges and Research: Coordination complexity, verification under diverse schema conditions, proactive agent discovery, and schema evolution strategies remain areas for ongoing refinement and theoretical study.
Community Foundations: OASF serves as a template for further research, community-driven schema proposals, and the systematic construction and governance of large, heterogeneous agent ecosystems.

7. Summary and Outlook

The Open Agentic Schema Framework operationalizes multi-agent collaboration, content-addressable and decoupled architecture, forward-compatible metadata taxonomies, and hierarchical capability indices to deliver a robust infrastructure for schema management in modern multi-agent AI environments. Its principles are demonstrated empirically across database semantic layering, multimodal entity extraction, hybrid retrieval, and federated directory services, positioning OASF as a foundational paradigm for scalable, secure, and extensible agent ecosystems.

References

Rissaki et al. (2024). Towards Agentic Schema Refinement.

Rosa (2025). RAVEN: An Agentic Framework for Multimodal Entity Discovery from Large-Scale Video Collections.

Muscariello et al. (2025). The AGNTCY Agent Directory Service: Architecture and Implementation.

Nagori et al. (2025). Open-Source Agentic Hybrid RAG Framework for Scientific Literature Review.
