AGAPI: Open-Access AI in Materials Science

Updated 16 December 2025
  • AGAPI is an open-access AI platform that integrates eight open-source LLMs and more than twenty materials-science endpoints to enable autonomous, multi-step workflows.
  • Its agent-planner-executor-summarizer pipeline supports deterministic, reproducible, and scalable research with robust logging and error-recovery mechanisms.
  • Performance evaluations show improved predictions for properties such as bulk modulus while exposing trade-offs on other targets, motivating selective tool augmentation.

AGAPI (AtomGPT.org API) is an open-access, agentic AI platform for materials science that centralizes eight open-source LLMs and more than twenty dedicated materials-science service endpoints. The platform unifies databases, simulation engines, and ML property predictors as interoperable web services, providing an orchestrated framework for fully autonomous, reproducible, and scalable multi-step workflows in computational materials discovery. AGAPI and its reference agentic implementation, AGAPI-Agents, are deployed and maintained at https://atomgpt.org and are distributed under open-source licenses (Lee et al., 12 Dec 2025, Choudhary, 6 May 2024).

1. Conceptual Design and Objectives

AGAPI was established to address longstanding obstacles in computational materials research—namely, the fragmentation of computational ecosystems, lack of robust reproducibility, and heavy dependence on proprietary LLMs. Its principal aims are:

  • To eliminate vendor lock-in by exclusively using eight self-hosted open-source LLMs, including Llama-3.2-90B-Vision, DeepSeek-V3, Qwen3-Next-80B, and GPT-OSS-20B.
  • To provide deterministic, version-pinned access to over 20 domain-specific endpoints (e.g., JARVIS-DFT, ALIGNN, SlaKoNet).
  • To enable fully autonomous construction and execution of complex, multi-step workflows (e.g., heterostructure design, X-ray diffraction analysis, defect engineering).
  • To support over 1,000 concurrent users through horizontal scaling, asynchronous execution, and robust caching.
  • To ensure transparency with complete logging of request–response pairs and deterministic (low-temperature) model sampling (Lee et al., 12 Dec 2025).

2. Agentic Architecture: Agent-Planner-Executor-Summarizer

AGAPI implements an agentic pipeline comprising four modular components—Agent, Planner, Executor, and Summarizer—mirroring a "brain-hands-eyes-voice" architecture.

  • Agent (LLM reasoning engine): Interprets user intent, maintains context, and generates strict “tool call” JSON objects, enforced by a schema injected via the system prompt. Error recovery is achieved through retries and corrective system prompts upon JSON validation failure (a minimal validate-and-retry sketch follows this list).
  • Planner: Decomposes user queries into a directed acyclic graph (DAG) of tool calls and resolves input–output dependencies. The following pseudocode encapsulates the planning logic:

function plan_workflow(user_query):
    # Extract one or more task intents from the natural-language query
    intents = LLM.extract_intents(user_query)
    workflow = []
    for intent in intents:
        # Map each intent to a registered AGAPI endpoint
        tool = select_tool_for(intent)
        workflow.append(tool)
    # Wire earlier outputs to later inputs, yielding a DAG of tool calls
    resolve_dependencies(workflow)
    return workflow  # list of (tool, inputs, dependencies)

  • Executor: Orchestrates synchronous/asynchronous endpoint invocations according to dependency order. Implements automatic retries for transient API failures and escalation to fallback tools (a dependency-aware dispatch sketch follows the pipeline diagram below).
  • Summarizer: Aggregates outputs, conducts physical sanity checks (e.g., validation of the formation energy $\Delta E_f$ against known stability ranges), and generates concise natural-language reports, tables, and structure viewers.
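
The Agent's tool-call contract can be made concrete with a small sketch. Below is a minimal, hypothetical validate-and-retry loop in the spirit of the error recovery described above; the schema fields, the `llm.generate` interface, and the retry policy are illustrative assumptions, not AGAPI's actual internals.

import json

# Hypothetical shape of a tool call; field names are assumptions.
REQUIRED_FIELDS = {"tool": str, "inputs": dict}

def validate_tool_call(raw: str) -> dict:
    """Parse and structurally validate an LLM-emitted tool call."""
    call = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(call.get(field), ftype):
            raise ValueError(f"missing or ill-typed field: {field}")
    return call

def request_tool_call(llm, user_query: str, max_retries: int = 3) -> dict:
    """Ask the LLM for a tool call; on validation failure, retry with a
    corrective instruction, as the Agent component does."""
    prompt = user_query
    for _ in range(max_retries):
        raw = llm.generate(prompt)  # hypothetical LLM interface
        try:
            return validate_tool_call(raw)
        except ValueError as err:
            # Inject a corrective system prompt and retry
            prompt = (user_query + f"\n\nYour last reply was invalid ({err}). "
                      'Reply with ONLY a JSON object of the form '
                      '{"tool": ..., "inputs": ...}.')
    raise RuntimeError("LLM failed to produce a valid tool call")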

The user–agentic interaction follows:

User → Agent → Planner → Executor → Summarizer → User
(Lee et al., 12 Dec 2025).
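
As a rough illustration of the Executor's dependency-aware dispatch, the asyncio sketch below executes a topologically sorted workflow, retries transient failures with a simple backoff, and escalates to a fallback tool; the `call_endpoint` coroutine, the fallback map, and the (tool, inputs, dependencies) tuple layout are assumptions for exposition.

import asyncio

async def run_step(call_endpoint, tool, inputs, retries=2, fallback=None):
    """Invoke one endpoint, retrying transient failures and escalating
    to a fallback tool on final failure (if one is registered)."""
    for attempt in range(retries + 1):
        try:
            return await call_endpoint(tool, inputs)
        except TimeoutError:
            if attempt < retries:
                await asyncio.sleep(0.5 * (attempt + 1))  # simple backoff
            elif fallback is not None:
                return await call_endpoint(fallback, inputs)
    raise RuntimeError(f"{tool} failed after {retries + 1} attempts")

async def run_workflow(call_endpoint, workflow, fallbacks=None):
    """Execute (tool, inputs, dependencies) steps in dependency order,
    feeding each completed step's output into its dependents."""
    fallbacks = fallbacks or {}
    results = {}
    for tool, inputs, deps in workflow:  # assumed topologically sorted
        # Merge outputs of completed dependencies into this step's inputs
        merged = {**inputs, **{d: results[d] for d in deps}}
        results[tool] = await run_step(call_endpoint, tool, merged,
                                       fallback=fallbacks.get(tool))
    return results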

3. Supported Endpoints, Tool Types, and Usage Examples

AGAPI exposes its functionality via HTTPS REST endpoints secured by JWT authentication. Endpoints are grouped into four principal categories:

| Category | Example Endpoint | Functionality |
|---|---|---|
| Database query | /jarvis_dft/query | Query JARVIS-DFT by composition, bandgap, formation energy |
| Property prediction | /alignn/query | Predict formation energy, bandgap, elastic constants, $T_c$ |
| Force-field/structure | /alignn_ff/query, /generate_interface | ML relaxation, heterointerface generation |
| Simulation/analysis | /pxrd/query, /slakonet/bandstructure | XRD simulation, band structure via SlaKoNet |
  • Python and REST interfaces are available; for example, to predict formation energy using ALIGNN (a raw REST variant follows this list):

    from agapi import AgapiClient

    # Authenticate with a JWT API key and call the ALIGNN prediction endpoint
    client = AgapiClient(api_key="YOUR_JWT")
    resp = client.alignn.query(poscar_path="POSCAR")
    print("Formation energy:", resp.json()["predictions"]["formation_energy"])
  • Heterointerface generation and powder XRD simulation are implemented with endpoints building on the Zur algorithm and structure-specific XRD pattern calculations, respectively.
  • All endpoints are fully documented at https://atomgpt.org/docs via an OpenAPI interface (Lee et al., 12 Dec 2025).
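
For users working outside Python, the same endpoints are reachable over plain HTTPS with a JWT bearer token. The sketch below uses the requests library; the exact URL path and payload keys for /alignn/query are assumptions inferred from the Python example above, not documented values.

import requests

# JWT-authenticated POST to a property-prediction endpoint
# (URL path and payload field names are illustrative assumptions).
with open("POSCAR") as f:
    poscar = f.read()

resp = requests.post(
    "https://atomgpt.org/api/alignn/query",
    headers={"Authorization": "Bearer YOUR_JWT"},
    json={"poscar": poscar},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["predictions"]["formation_energy"])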

4. Multi-Model Integration and Orchestration

AGAPI's LLM layer integrates eight open-source models, each benchmarked for scientific accuracy and inference speed. The routing stack utilizes Ollama and vLLM frameworks to host models in parallel, enabling:

  • Default routing to GPT-OSS-20B (measured at ~141.7 tokens/sec, 3.93× baseline throughput).
  • Fallback to GPT-OSS-120B on parsing or timeout errors.
  • Optional escalation to commercial LLM APIs (e.g., OpenAI GPT-4) as a last resort.

Strict reproducibility is ensured through version pinning, temperature set to zero (or near-zero), and comprehensive logging. This infrastructure supports robust, repeatable research workflows and facilitates the transparent comparison of LLM predictions with or without tool-assisted inference (Lee et al., 12 Dec 2025).
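
The routing policy amounts to a small ordered fallback chain. A minimal sketch is shown below, assuming a generic chat-completion client; the model identifiers come from the text, while the `client.complete` interface is an assumption.

# Ordered fallback chain: local default, larger local model, then a
# commercial API as a last resort (client interface is an assumption).
MODEL_CHAIN = ["gpt-oss-20b", "gpt-oss-120b", "gpt-4"]

def route_completion(client, prompt: str) -> str:
    last_error = None
    for model in MODEL_CHAIN:
        try:
            # temperature=0 for deterministic, reproducible sampling
            return client.complete(model=model, prompt=prompt, temperature=0)
        except (TimeoutError, ValueError) as err:  # timeout or parse failure
            last_error = err  # escalate to the next model in the chain
    raise RuntimeError(f"all models in the chain failed: {last_error}")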

5. Workflow Automation: Practical Scenarios and Internal Algorithms

AGAPI automates workflow construction and execution in materials design, supporting pipelines up to ten steps in length. Example scenarios include:

  • Semiconductor Defect Engineering (10-stage):

    1. Database search → 2. Structure retrieval → 3. Supercell expansion → 4. Dopant substitution →
    5. ML geometry relaxation (ALIGNN-FF) → 6. XRD simulation → 7. ALIGNN property prediction →
    8. Band structure calculation (SlaKoNet) → 9. Result aggregation → 10. Summary generation

  • Heterostructure Construction (a chaining sketch follows this list):

    1. Polymorph query → 2. Energy ranking → 3. Pair selection → 4. Interface building → 5. POSCAR output

  • Powder XRD Pattern Analysis:

    1. Structure identification → 2. Atom coordinate retrieval → 3. XRD simulation → 4. Peak extraction → 5. Interpretation
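
To make the chaining concrete, here is a minimal sketch of the five-step heterostructure scenario with the Python client. The method names beyond `client.alignn.query` (e.g., `client.jarvis_dft.query`, `client.generate_interface`) and the response fields are plausible but assumed, following the endpoint table in Section 3.

from agapi import AgapiClient

client = AgapiClient(api_key="YOUR_JWT")

# 1-2. Query polymorphs and rank by formation energy
#      (method and field names are illustrative assumptions)
polymorphs = client.jarvis_dft.query(formula="MoS2").json()["results"]
ranked = sorted(polymorphs, key=lambda p: p["formation_energy"])

# 3. Select the two lowest-energy polymorphs as the interface pair
film, substrate = ranked[0], ranked[1]

# 4. Build the heterointerface via the Zur-algorithm-backed endpoint
iface = client.generate_interface(
    film_id=film["jid"], substrate_id=substrate["jid"]
).json()

# 5. Write the resulting structure to disk
with open("POSCAR_interface", "w") as f:
    f.write(iface["poscar"])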

Key mathematical underpinnings:

  • ALIGNN's property prediction is summarized by the graph-line-graph convolution:

\mathbf{h}_v^{(l+1)} = \sigma\Bigl(\sum_{(u,v)\in E} W_e\,\phi\bigl(\mathbf{h}_u^{(l)}, \mathbf{h}_v^{(l)}, \mathbf{e}_{uv}\bigr)\Bigr)

  • ML force-field optimization proceeds by gradient descent on a learned energy surface (a toy instance follows the equations below):

\mathbf{R}_{t+1} = \mathbf{R}_t - \eta\,\nabla_{\mathbf{R}} E_\theta(\mathbf{R}_t)

  • Tight-binding band structures via SlaKoNet (a worked 1D example follows below):

H_{\alpha\beta}(\mathbf{k}) = \sum_{\mathbf{R}} H_{\alpha\beta}(\mathbf{R})\,e^{i\mathbf{k}\cdot\mathbf{R}}, \qquad E_n(\mathbf{k}) = \mathrm{eig}_n\bigl(H(\mathbf{k})\bigr)
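
As a toy instance of the force-field update rule, the loop below relaxes a single 1D "bond" on a harmonic energy surface standing in for the learned $E_\theta$; it converges to the equilibrium length $r_0$.

def energy(r, k=1.0, r0=1.5):
    """Harmonic stand-in for a learned energy surface E_theta."""
    return 0.5 * k * (r - r0) ** 2

def grad(r, k=1.0, r0=1.5):
    """Analytic gradient dE/dr of the stand-in energy."""
    return k * (r - r0)

r, eta = 2.5, 0.1  # initial bond length and step size
for _ in range(100):
    r = r - eta * grad(r)  # R_{t+1} = R_t - eta * grad E_theta(R_t)
print(round(r, 4))  # ~1.5: converged to the equilibrium bond length r0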
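
And as a worked instance of the Bloch-sum formula, the sketch below assembles $H(k)$ for a single-orbital 1D chain with nearest-neighbor hopping $t$ and diagonalizes it, recovering the textbook band $E(k) = \varepsilon_0 - 2t\cos(ka)$. This illustrates the mathematics only, not SlaKoNet's actual parameterization.

import numpy as np

eps0, t, a = 0.0, 1.0, 1.0  # onsite energy, hopping amplitude, lattice constant

def bands(k):
    """H(k) = sum_R H(R) e^{ikR} for R in {-a, 0, +a}, then E_n(k) = eig_n(H(k))."""
    H = np.array([[eps0]], dtype=complex)  # onsite (R = 0) term
    for R, hop in ((-a, -t), (+a, -t)):    # nearest-neighbor hoppings
        H = H + hop * np.exp(1j * k * R)
    return np.linalg.eigvalsh(H)           # real eigenvalues of Hermitian H(k)

for k in np.linspace(-np.pi / a, np.pi / a, 5):
    print(f"k = {k:+.2f}, E = {bands(k)[0]:+.3f}")  # matches eps0 - 2*t*cos(k*a)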

AGAPI further orchestrates deterministic tool execution via asynchronous queues, dependency-aware dispatch, and fallback escalation (Lee et al., 12 Dec 2025).

6. Evaluation, Limitations, and Impact

Systematic evaluation employed more than 30 prompt test suites (ag1–ag34), covering both single-tool and agentic multi-tool pipelines. Comparison against the JARVIS-Leaderboard test set using MAE and $R^2$ metrics on five material properties yielded:

  • Bulk modulus: MAE improves from 7.876 GPa (tool-free) to 5.732 GPa (tool-augmented); $R^2$ rises from 0.984 to 0.994.
  • Band gap: MAE degrades from 0.353 eV to 0.495 eV.
  • Superconducting $T_c$: MAE degrades from 0.681 K to 3.378 K (roughly a 5× increase).
  • SLME and dielectric constant: errors increase by up to 86%.

These results indicate that tool augmentation is most beneficial where high-quality, database-backed inference exists (as for bulk modulus) but can degrade performance where predictions are dominated by literature knowledge (e.g., band gap, $T_c$). This suggests that LLM-only inference and tool-driven workflows should be combined selectively, depending on the property and system context (Lee et al., 12 Dec 2025).
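
For reference, the two reported metrics follow their standard definitions; the snippet below computes them on placeholder arrays (the actual targets and predictions come from the JARVIS-Leaderboard test set).

import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder arrays standing in for leaderboard data
y_true = np.array([100.0, 150.0, 200.0])
y_pred = np.array([98.0, 155.0, 195.0])
print(mae(y_true, y_pred), r2(y_true, y_pred))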

7. Reproducibility, Performance, and Access

AGAPI ensures research reproducibility through strict model/tool version pinning, deterministic inference (temperature ≈ 0), and full request/response archiving. Scalability is maintained through asynchronous task execution, inference batching, endpoint caching, and distributed LLM servers (Ollama, vLLM).
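
A run manifest capturing these reproducibility settings might look like the sketch below; the field names and values are illustrative assumptions, not AGAPI's actual configuration schema.

# Hypothetical run manifest: pin model and tool versions, force
# deterministic sampling, and record where request/response logs go.
RUN_MANIFEST = {
    "llm": {"name": "gpt-oss-20b", "version": "pinned-version", "temperature": 0.0},
    "tools": {"alignn": "pinned-version", "slakonet": "pinned-version"},
    "logging": {"archive_dir": "logs/requests", "store_responses": True},
}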

To access AGAPI:

  1. Install the Python client: pip install agapi-client
  2. Register and obtain an API key at https://atomgpt.org
  3. Interact via Python, REST, or the integrated OpenAPI UI
  4. Example usage for formation energy prediction:

    from agapi import AgapiClient

    # Authenticate and request an ALIGNN formation-energy prediction
    client = AgapiClient(api_key="YOUR_KEY")
    resp = client.alignn.query(poscar_path="POSCAR")
    print("Formation energy:", resp.json()["predictions"]["formation_energy"])

Additional documentation and reproducible notebook examples are provided in the AGAPI-Agents GitHub repository (https://github.com/atomgptlab/agapi) (Lee et al., 12 Dec 2025).

A plausible implication is that AGAPI, by modularizing agentic orchestration and supporting extensible endpoints, offers a transparent and scalable foundation for integrating future generative and predictive models, thereby catalyzing reproducible, AI-accelerated materials discovery.
