
OptiMat Alloys: A FAIR End-to-End Agent with Living Database for Computational Multi-Principal Alloy Exploration

Published 23 Apr 2026 in cond-mat.mtrl-sci | (2604.21850v1)

Abstract: The FAIR principles have transformed how computational data and workflows are shared in materials research, yet existing repositories can only serve pre-computed entries: broad coverage is perpetually incomplete and cannot adapt to new questions on demand. To address these challenges, we present OptiMat Alloys, an LLM-powered conversational agent for multi-principal element alloy exploration built on three pillars: a living database that stores every calculation with provenance, low-barrier accessibility through a web interface requiring zero programming expertise, and built-in uncertainty quantification via cross-potential and cross-configuration validation (see demo here https://youtu.be/lQzuorkzPMc). Coupling foundational machine learning interatomic potentials covering nearly the entire periodic table with natural-language interaction, OptiMat Alloys enables targeted, on-demand computation guided by the user's domain knowledge, extending FAIR from pre-computed repositories to on-demand knowledge generation and making computational alloy screening accessible to any materials scientist.

Authors (2)

Summary

  • The paper introduces an agentic system that integrates LLM orchestration with U-MLIP simulations for efficient, on-demand alloy property prediction.
  • The paper demonstrates a million-fold speedup over DFT while maintaining near-DFT accuracy and automated, FAIR-compliant data handling.
  • The paper offers a modular, interactive architecture that supports autonomous, reproducible, and scalable exploration of multi-principal alloy spaces.

OptiMat Alloys: A FAIR-Compliant Agentic System for Multi-Principal Alloy Exploration

Context: Data Infrastructure Challenges in Alloy Discovery

Despite the proliferation of FAIR (Findable, Accessible, Interoperable, Reusable) repository infrastructure supporting computational materials science, existing databases built on these principles cannot adapt to demand-driven exploration. The combinatorial explosion in multi-principal element alloys (MPEAs), with over $10^7$ possible quinary systems and $10^{54}$ candidates for higher-order systems, illustrates the coverage deficit. Current repositories (e.g., Materials Project, NOMAD, AFLOW) and even advanced modeling environments serve only precomputed entries, and their knowledge capture is limited by the number of competent simulation practitioners and by laborious deposition pipelines. As Figure 1 shows, only a minuscule fraction of the theoretical alloy design space is mapped in current thermodynamic or computational databases.

Figure 1: The combinatorial explosion of possible alloy systems vastly exceeds the coverage available in existing thermodynamic databases and experiments.
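The scale of this design space can be illustrated with a short counting sketch. The summary does not state which element palette or composition grid underlies the paper's figures, so the palette size of 80 elements and the 5 at.% grid below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
from math import comb


def count_systems(n_elements: int, k: int) -> int:
    """Number of distinct k-element systems drawn from a palette of n_elements."""
    return comb(n_elements, k)


def count_compositions(k: int, step_percent: int) -> int:
    """Compositions of one k-component system on a grid of step_percent at.%,
    with every component present at least once (stars-and-bars count)."""
    if step_percent <= 0 or 100 % step_percent != 0:
        raise ValueError("step_percent must be a positive divisor of 100")
    units = 100 // step_percent
    return comb(units - 1, k - 1) if units >= k else 0


if __name__ == "__main__":
    palette = 80  # assumed palette size, not a number taken from the paper
    print("quinary systems:", count_systems(palette, 5))
    print("grid points per quinary system (5 at.% steps):", count_compositions(5, 5))
```

Under these assumptions the number of quinary element combinations alone already exceeds $10^7$, before any composition grid is applied within each system.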

Paradigm Evolution: Agentic Computing and the Role of OptiMat Alloys

Traditional atomistic workflows (Software 1.0) relied on hand-coded simulation and workflow management tools. The advent of universal ML-based interatomic potentials (U-MLIPs, Software 2.0) massively expanded the scope of near-DFT-accuracy simulation, but these tools remain siloed and demand substantial setup overhead and simulation expertise. The field is now evolving toward agentic computing (Software 3.0), where LLMs orchestrate simulation execution, bridge user intent and technical implementation through natural language, and enforce best-practice guardrails. OptiMat Alloys embodies this paradigm: it wraps simulation, MLIP-based computation, and rigorous provenance into an interactive, autonomous agent. The system actively retrieves prior computations, dispatches new ones as needed, and maintains a growing, queryable living database.

Figure 2: Agentic computing evolution, integrating U-MLIPs and AI agents for on-demand, natural language-driven alloy exploration in OptiMat Alloys.
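The retrieve-first, compute-on-miss behavior described above can be sketched as a small lookup pattern. This is a minimal illustration and not the OptiMat Alloys implementation: `LivingStore`, the key layout, and the placeholder calculator are all hypothetical, and a real system would persist records rather than hold them in memory.

```python
from typing import Callable, Dict, Tuple

# Hypothetical record key: (composition, property name, potential name).
Key = Tuple[str, str, str]


class LivingStore:
    """In-memory stand-in for a living database: reuse results, compute on a miss."""

    def __init__(self) -> None:
        self._records: Dict[Key, float] = {}
        self.computed = 0  # number of calculations actually dispatched

    def get_or_compute(self, key: Key, compute: Callable[[Key], float]) -> float:
        if key in self._records:
            return self._records[key]  # prior computation retrieved
        value = compute(key)           # new computation dispatched
        self._records[key] = value     # result deposited for future queries
        self.computed += 1
        return value


def placeholder_calculator(key: Key) -> float:
    """Stands in for an MLIP calculation; returns a made-up constant."""
    return 200.0


if __name__ == "__main__":
    store = LivingStore()
    key = ("CoCrFeNi", "bulk_modulus", "MACE")
    store.get_or_compute(key, placeholder_calculator)
    store.get_or_compute(key, placeholder_calculator)
    print("calculations dispatched:", store.computed)
```

The second request for the same key is answered from the store, which is the property that lets every query both consume and grow the database.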

Technical Architecture and Design Features

OptiMat Alloys consists of a five-layer modular architecture:

  1. Interaction Layer: Provides a web-based chat interface for accessible, real-time interaction, leveraging Markdown rendering and Plotly visualization.
  2. Agent Layer: An AutoGen-based 'Scientist agent' interprets user intent, selects tools, chains operations, and manages context and error handling across LLM backends (OpenRouter, Ollama).
  3. Tools Layer: Encapsulates structure generation, property prediction, cross-potential/cross-configuration validation, and persistent database operations as audited APIs.
  4. Computation Layer: Employs state-of-the-art U-MLIPs (ORB, NequIP, MACE) for structure relaxation, property evaluation, and phononic calculations.
  5. Data Layer: Implements SQLite-based persistent storage with full UUID-based provenance, supporting decentralized, mergeable database federation.

Figure 3: Schematic of the five-layer OptiMat Alloys architecture, from web interface to database persistence.

The agent tightly constrains tool interaction through rigorous prompt engineering and parameter annotation, circumventing the non-determinism and error-proneness of unconstrained LLM code generation. Advanced LLM backends with high reasoning capacity (GLM-4.5-Air, MiMo-V2-Flash, GPT-OSS-120B) reproducibly achieve >95% reliability in scientific tool-calling workflows; smaller, quantized models remain bottlenecked by memory and context limitations.
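One way to picture constrained tool interaction is a tool entry point that accepts only a small, validated parameter set instead of free-form generated code. The sketch below is plain Python and does not use or reproduce the AutoGen API; `PropertyRequest`, `validate_request`, and the allowed lattice list are hypothetical, while the potential names are the ones mentioned in the text.

```python
from dataclasses import dataclass
from typing import Dict

ALLOWED_POTENTIALS = {"ORB", "MACE", "NequIP"}  # names taken from the text
ALLOWED_LATTICES = {"FCC", "BCC", "HCP"}        # assumed set, for illustration


@dataclass(frozen=True)
class PropertyRequest:
    """Validated arguments for a hypothetical property-prediction tool."""
    composition: Dict[str, float]
    lattice: str
    potential: str


def validate_request(raw: dict) -> PropertyRequest:
    """Reject malformed tool arguments before any simulation is launched."""
    composition = {str(el): float(x) for el, x in raw["composition"].items()}
    if not composition or any(x <= 0 for x in composition.values()):
        raise ValueError("composition needs positive fractions")
    if abs(sum(composition.values()) - 1.0) > 1e-6:
        raise ValueError("composition fractions must sum to 1")
    lattice = str(raw["lattice"]).upper()
    if lattice not in ALLOWED_LATTICES:
        raise ValueError(f"unsupported lattice: {lattice}")
    potential = str(raw["potential"])
    if potential not in ALLOWED_POTENTIALS:
        raise ValueError(f"unsupported potential: {potential}")
    return PropertyRequest(composition, lattice, potential)


if __name__ == "__main__":
    request = validate_request({
        "composition": {"Co": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25},
        "lattice": "fcc",
        "potential": "MACE",
    })
    print(request)
```

Because the model can only fill in these few annotated fields, a malformed call fails fast with a readable error that the agent can act on, rather than producing a silently wrong simulation.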

Performance Evaluation: Speed, Fidelity, and Accessibility

OptiMat Alloys delivers a million-fold speedup over DFT-based simulation for alloy property screening, while retaining near-DFT accuracy. Benchmarks on CoCrFeNi FCC supercells indicate that U-MLIP computation (e.g., ORB, MACE, NequIP) performs single-point energy and force evaluations in approximately 0.10 to 0.20 ms per structure, independent of system sizes up to thousands of atoms. In contrast, DFT with VASP requires multiple days for equivalent evaluations on 256-atom supercells. Accuracy is retained: energy-above-hull MAEs are under 30 meV/atom on standard benchmarks, and formation energy and lattice parameter MAEs are under 0.015 eV/atom and 0.011 Å, respectively.

Deployment barriers are significantly lower than for any other current agentic materials workflow system: OptiMat Alloys' containerized or cloud-based interface requires no coding skill, in stark contrast to prior systems with complex dependencies or mandatory HPC access. This accessibility broadens the contributor base to include experimentalists and non-experts, catalyzing organic knowledge accumulation.

Living Database and Autonomous Data Accumulation

A defining aspect is the persistent, self-growing database: every conversational computation is stored with full provenance, its computed properties, and a reproducibility record. During developmental deployment over 54 active days, the living database grew to 491 structures spanning a range of compositional complexities, predominantly compositionally complex (6+ elements) and FCC alloys, in line with current MPEA research interests.

Figure 4: Longitudinal growth and content analysis of the OptiMat Alloys living database, highlighting compositional complexity spread and property coverage.

Owing to UUID-based identification and standardized schema, federating databases across independent research groups is trivial, mitigating the siloing issue of legacy repository architectures.
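The federation idea can be sketched with Python's standard `sqlite3` and `uuid` modules: when every record carries a globally unique identifier as its primary key, merging two independently grown databases reduces to an idempotent insert. The table layout and function names below are assumptions for illustration and are not the schema used by OptiMat Alloys.

```python
import sqlite3
import uuid

# Hypothetical minimal schema; the real schema is not given in this summary.
SCHEMA = """
CREATE TABLE IF NOT EXISTS calculations (
    id TEXT PRIMARY KEY,
    composition TEXT NOT NULL,
    potential TEXT NOT NULL,
    property TEXT NOT NULL,
    value REAL NOT NULL
)
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_record(conn: sqlite3.Connection, composition: str, potential: str,
               prop: str, value: float) -> str:
    """Store one result under a freshly generated UUID and return that UUID."""
    record_id = str(uuid.uuid4())
    conn.execute("INSERT INTO calculations VALUES (?, ?, ?, ?, ?)",
                 (record_id, composition, potential, prop, value))
    conn.commit()
    return record_id


def count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM calculations").fetchone()[0]


def merge(target: sqlite3.Connection, source: sqlite3.Connection) -> int:
    """Copy records from source into target, skipping UUIDs already present.
    Returns the number of newly added records."""
    before = count(target)
    rows = source.execute(
        "SELECT id, composition, potential, property, value FROM calculations"
    ).fetchall()
    target.executemany(
        "INSERT OR IGNORE INTO calculations VALUES (?, ?, ?, ?, ?)", rows)
    target.commit()
    return count(target) - before
```

Since `INSERT OR IGNORE` skips rows whose primary key already exists, merging the same source twice adds nothing the second time, so groups can exchange databases repeatedly without deduplication logic.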

Case Study: On-Demand Phase Stability and Property Prediction

The agent enables rapid, on-demand screening of realistic, non-tabulated alloy compositions guided by user intuition or recent experimental trends. In a representative study, OptiMat Alloys generated and validated bulk properties for experimentally realized compositions from a combinatorial Co-Cr-Fe-Mo-Ni-W thin-film library. For a W/Mo-rich BCC composition, the agent predicted a bulk modulus $K = 258 \pm 2$ GPa and Young's modulus $E_\mathrm{VRH} = 236 \pm 5$ GPa, exceeding thin-film values due to known effects of porosity and film strain. Phase stability, determined from QHA-calculated Gibbs energy differences, matched experimental phase observations and allowed for the quantification of phase competition and compositionally driven strain in the thin-film context.

Figure 5: QHA-derived Gibbs free energy vs temperature for prototypical W/Mo-rich (BCC) and Co/Ni-rich (HCP) alloys, enabling phase-stability inference.
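For readers unfamiliar with the VRH subscript on the Young's modulus above, the Voigt-Reuss-Hill average can be sketched for the simplest case of a cubic crystal with three independent elastic constants. The summary does not say how the agent obtains its VRH values, and SQS supercells are generally not exactly cubic, so this is an illustration of the standard textbook formulas only; the function name and the input values are hypothetical.

```python
def vrh_cubic(c11: float, c12: float, c44: float) -> dict:
    """Voigt-Reuss-Hill polycrystalline averages for a cubic crystal (inputs in GPa)."""
    bulk = (c11 + 2.0 * c12) / 3.0  # Voigt and Reuss bulk moduli coincide for cubic symmetry
    g_voigt = (c11 - c12 + 3.0 * c44) / 5.0
    g_reuss = 5.0 * (c11 - c12) * c44 / (4.0 * c44 + 3.0 * (c11 - c12))
    shear = 0.5 * (g_voigt + g_reuss)  # Hill average of the two bounds
    young = 9.0 * bulk * shear / (3.0 * bulk + shear)
    poisson = (3.0 * bulk - 2.0 * shear) / (2.0 * (3.0 * bulk + shear))
    return {"K": bulk, "G_VRH": shear, "E_VRH": young, "nu": poisson}


if __name__ == "__main__":
    # Made-up elastic constants in GPa, not values from the paper.
    print(vrh_cubic(c11=300.0, c12=100.0, c44=100.0))
```

The Voigt and Reuss shear moduli are upper and lower bounds, and they coincide when the crystal is elastically isotropic, that is, when $C_{44} = (C_{11} - C_{12})/2$.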

All results are retained, and future queries (e.g., cross-comparison of mechanical properties for two alloys) are resolved by retrieval and aggregation over SQS configuration ensembles, with uncertainty quantification based on both inter-potential and many-body disorder statistics.

Figure 6: Demonstration of agentic retrieval and scientific comparison between alloy entries in the OptiMat Alloys living database.
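The two sources of spread mentioned above, variation across SQS configurations and variation across potentials, can be separated with simple ensemble statistics. The sketch below assumes one property value per (potential, configuration) pair; the function name and the aggregation choices (sample standard deviations, mean of per-potential means) are assumptions, since the summary does not specify the exact estimator used by OptiMat Alloys.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_ensemble(results: Dict[str, List[float]]) -> dict:
    """results maps potential name -> property values over SQS configurations."""
    per_potential = {}
    for name, values in results.items():
        per_potential[name] = {
            "mean": mean(values),
            # spread caused by different atomic arrangements at fixed composition
            "config_std": stdev(values) if len(values) > 1 else 0.0,
        }
    means = [entry["mean"] for entry in per_potential.values()]
    return {
        "per_potential": per_potential,
        "ensemble_mean": mean(means),
        # spread caused by disagreement between interatomic potentials
        "cross_potential_std": stdev(means) if len(means) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Made-up bulk moduli in GPa for three SQS cells per potential.
    demo = {"ORB": [201.0, 203.0, 199.0], "MACE": [205.0, 207.0, 206.0]}
    print(summarize_ensemble(demo))
```

Reporting the two spreads separately shows whether an uncertainty is dominated by configurational disorder or by the choice of potential, which suggests where additional calculations would help most.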

FAIRness, the Four-V Framework, and Community Implications

Figure 7: Mapping of current database and agentic system limitations (left) to the specific contributions of OptiMat Alloys (right) for each of the Four V's: volume, variety, velocity, veracity.

OptiMat Alloys substantively advances all four V's in the big data paradigm for computational alloy discovery:

  • Volume: Bidirectional, automatic accumulation as every query deposits results, with automated schema enforcement.
  • Variety: Universality of U-MLIPs removes forcefield parameterization bottlenecks for complex alloys, expanding system applicability.
  • Velocity: Zero-programming web deployment and LLM-driven simulation orchestration scale the data-generation rate to a larger user base.
  • Veracity: Organic uncertainty quantification through SQS ensemble statistics and cross-potential recomputation supports robust error assessment.

OptiMat Alloys productionizes methodological advances such as rigorous prompt engineering, compositional complexity management, containerized reproducibility, and full computational provenance capture. These enable organic, community-driven map-building over the combinatorial MPEA design space and reduce the expertise/maintenance/silo gap impeding existing database architectures.

Community-wide, the living-database agentic approach exposes new challenges: reproducibility in the face of evolving MLIP and LLM backends, the need for benchmarks to evaluate agent tool reliability, and the development of standards for result exchange/interoperability. OptiMat Alloys exemplifies and motivates the transition to FAIR agent systems, rather than static FAIR datasets, as the backbone of future computational materials infrastructure.

Conclusion

OptiMat Alloys demonstrates an effective integration of LLM agent orchestration, U-MLIP-driven rapid property prediction, and persistent, FAIR-compliant data accumulation for computational alloy design. It achieves near-DFT fidelity with million-fold speedup, substantially lowers user and deployment thresholds, and enables organic, autonomously accumulating coverage of the combinatorial alloy design space. The cross-potential and configuration-aware uncertainty quantification sets a new bar for veracity in high-throughput screening agents. This approach reconfigures computational materials science by merging reproducible simulation, democratized access, and persistent knowledge capture, and it foregrounds a shift toward self-growing, agent-driven research infrastructure supporting both data generation and interpretation. Future work will extend living-database orchestration to closed-loop experimental feedback, more nuanced chemistry-aware structure generation, and standardization for cross-agent comparability and validation.
