JARVIS: Integrated Materials Simulation Repository

Updated 29 May 2026
  • JARVIS is a comprehensive open-access platform integrating quantum, classical, ML, and experimental methods for data-driven materials design.
  • It provides unified databases, tools, web apps, and benchmarks that ensure transparency, reproducibility, and adherence to FAIR principles.
  • The infrastructure supports advanced simulation techniques including DFT, force fields, graph neural networks, and quantum algorithms to accelerate both forward and inverse materials discovery.

The Joint Automated Repository for Various Integrated Simulations (JARVIS) is a comprehensive, open-access infrastructure supporting data-driven materials design through tight integration of quantum-mechanical calculations, classical modeling, machine learning, and experimental data. Originating at the National Institute of Standards and Technology (NIST), JARVIS provides unified databases, workflows, tools, and benchmarks covering thousands of elemental, binary, ternary, and more complex compounds, with a focus on transparency, reproducibility, and extensibility. Its architecture and scope support multiscale and multimodal research, accelerating both forward and inverse approaches to materials discovery (Choudhary, 6 Mar 2025, Wines et al., 2023, Choudhary et al., 2020).

1. System Architecture and Core Components

JARVIS implements a modular platform that unifies theoretical and experimental methodologies under a single data model and automated workflow engine. The infrastructure consists of six integrated core components:

  1. Databases:
    • JARVIS-DFT: Contains ∼80,000 materials with structural, electronic, optical, and mechanical properties computed via density functional theory (DFT).
    • JARVIS-FF: Compiles classical force-field outputs for ∼2,000 materials.
    • JARVIS-ML and JARVIS-Exp: Store machine learning predictions and experimental data (microscopy, diffraction, cryogenics).
    • External collections: Alexandria, OQMD, AFLOW, NOMAD.
  2. Tools:
    • jarvis-tools: Open-source Python libraries providing wrappers for VASP, Quantum ESPRESSO (QE), LAMMPS, GULP, ASE, Qiskit, and other codes, automating input generation, output parsing, and data ingestion.
  3. Interactive Web Apps:
    • Browser dashboards and RESTful APIs for data exploration, visualization, and property prediction.
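  4. Benchmarks:
    • JARVIS-Leaderboard: Community-driven benchmark rankings with uniform metrics and versioned data splits.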
  5. Tutorials:
    • Over 100 Jupyter, Colab, and SLMat notebooks on electronic structure, ML, force-fields, and quantum computing.
  6. Outreach:
    • Regular workshops (AIMS, QMMS, JARVIS-School), webinars, and collaborative hackathons.

All data adhere to FAIR principles: findability (unique JARVIS IDs, indexed metadata), accessibility (REST APIs, database dumps, open-source code), interoperability (JSON/YAML schemas, standardized units, HDF5 for large arrays), and reusability (versioned datasets, full provenance information) (Choudhary, 6 Mar 2025).

2. Supported Theoretical and Experimental Methodologies

JARVIS encompasses a full spectrum of computational and experimental approaches:

  • Quantum-Mechanical Methods:
    • Density Functional Theory (DFT): Plane-wave methods implemented in VASP and QE, using LDA, GGA (PBE), meta-GGA, OptB88vdW, and TB-mBJ functionals. Central equation:

    $$\left[ -\frac{\hbar^2}{2m}\nabla^2 + V_\mathrm{ext}(\mathbf{r}) + V_\mathrm{H}[\rho](\mathbf{r}) + V_\mathrm{XC}[\rho](\mathbf{r}) \right] \psi_i(\mathbf{r}) = \varepsilon_i \psi_i(\mathbf{r})$$

    • Tight-Binding Models (QETB): Self-consistent charge TB with two- and three-body terms, periodic table coverage.
    • Dynamical Mean-Field Theory (DMFT): Local Hubbard $U$ implemented via TRIQS.
    • Quantum Monte Carlo (QMC): Variational and diffusion Monte Carlo with Metropolis sampling.
    • Quantum Computing: VQE via Qiskit/BenchQC on model Hamiltonians.

  • Machine-Learning and Data-Driven Approaches:

    • Fingerprinting: CFID descriptors, Magpie features, site-chemistry, matminer integration.
    • Graph Neural Networks: ALIGNN, AtomVision; line-graph methods for two- and three-body encoding.
    • Transformer Models: AtomGPT and DiffractGPT for generative and inference tasks.
  • Classical Methods:
    • Empirical Molecular Dynamics: LAMMPS, GULP employing EAM, Buckingham, ReaxFF potentials.
    • ML-Based Force Fields: ALIGNN-FF for quantum-accuracy MD simulations.
  • Experimental Techniques:
    • Microscopy, diffraction, and cryogenic measurements archived in JARVIS-Exp.
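
The line-graph encoding listed under graph neural networks can be illustrated with a small sketch. It assumes an undirected bond list as input: each bond becomes a node of the line graph, and two bonds are linked when they share an atom, so every line-graph edge corresponds to a bond-angle triplet (a three-body term). The function name `build_line_graph` and the toy bond list are hypothetical; this is a simplified illustration of the idea, not the ALIGNN implementation.

```python
from itertools import combinations


def build_line_graph(bonds):
    """Build an undirected line graph from a list of bonds (atom index pairs).

    Returns (nodes, edges): nodes are the normalized bonds (two-body terms);
    each edge is (bond_a, bond_b, shared_atom), i.e. one angle centred on
    shared_atom (a three-body term).
    """
    # Normalize so that (i, j) and (j, i) map to the same line-graph node.
    nodes = sorted({tuple(sorted(b)) for b in bonds})
    edges = []
    for a, b in combinations(nodes, 2):
        shared = set(a) & set(b)
        if len(shared) == 1:  # two distinct bonds meeting at exactly one atom
            edges.append((a, b, shared.pop()))
    return nodes, edges


# Hypothetical bent triatomic: atom 0 is bonded to atoms 1 and 2.
nodes, edges = build_line_graph([(0, 1), (2, 0)])
print(nodes)
print(edges)

# A triangle of atoms has three bonds and three angles.
tri_nodes, tri_edges = build_line_graph([(0, 1), (1, 2), (0, 2)])
```

Message passing on the original graph updates atom and bond features, while message passing on the line graph updates bond and angle features. That is how a model of this kind can carry both two- and three-body information.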

3. Data Model, Storage, and Access

Material entries are managed with a hybrid relational/NoSQL schema, encompassing:

  • Core tables: composition, structure, computed and experimental properties.
  • Metadata: JSON encoding of full input provenance, including calculation parameters and software versions.
  • Large arrays (band structures, DOS, phonons): HDF5 storage.
  • Volumes: ∼6 million unique materials, ∼10 million properties, >8 million leaderboard datapoints, with 2 million database downloads to date.
  • API endpoints support:
    • GET requests for structural and electronic data in JSON.
    • POST requests for property prediction using ML surrogates (e.g., ALIGNN).
  • Schema snapshots are versioned and released via Figshare (Choudhary, 6 Mar 2025).
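
A minimal sketch of how an entry with JSON-encoded provenance might look is given below. All field names (`id`, `provenance`, `array_refs`, and so on), the placeholder identifier, and the values are assumptions made for illustration; they are not the actual JARVIS schema. The sketch only shows the pattern described above: scalar properties and provenance travel as JSON, while large arrays are referenced by a pointer into separate storage such as HDF5.

```python
import json

# Hypothetical minimum set of provenance fields; not the real JARVIS schema.
REQUIRED_PROVENANCE = ("code", "code_version", "parameters")


def make_entry(entry_id, formula, properties, provenance, array_refs=None):
    """Assemble a JSON-serializable material entry with input provenance."""
    missing = [key for key in REQUIRED_PROVENANCE if key not in provenance]
    if missing:
        raise ValueError(f"provenance is missing fields: {missing}")
    return {
        "id": entry_id,
        "formula": formula,
        "properties": properties,
        "provenance": provenance,
        # Large arrays are referenced, not embedded in the JSON document.
        "array_refs": array_refs or {},
    }


entry = make_entry(
    entry_id="EXAMPLE-0001",             # placeholder, not a real JARVIS ID
    formula="Si",
    properties={"band_gap_eV": 1.0},     # illustrative value, not a database result
    provenance={
        "code": "VASP",
        "code_version": "x.y.z",         # placeholder version string
        "parameters": {"functional": "OptB88vdW", "encut_eV": 500},
    },
    array_refs={"dos": "arrays.h5:/EXAMPLE-0001/dos"},  # hypothetical HDF5 path
)

serialized = json.dumps(entry, sort_keys=True, indent=2)
restored = json.loads(serialized)
print(serialized)
```

Validating provenance when the entry is created, rather than when it is read, keeps incomplete records out of versioned snapshots.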

4. Algorithms, Models, and Benchmarking

JARVIS features a range of algorithmic workflows and best-practice model development:

  • Quantum and Classical Simulations: Calculations are tightly converged with respect to forces, energies, and stresses. Phonons are computed with finite-difference and DFPT workflows, and the full force constants and phonon DOS are archived (Gurunathan et al., 2022, Choudhary et al., 2020).
  • ML Models:
    • ALIGNN: Two coupled graphs (structural and line) with message passing layers; outperforms analytic models (Debye, Born–von Kármán) for phonon DOS and thermodynamic properties.
    • Loss functions: For regression, $\mathcal{L} = \frac{1}{N}\sum_i (y_i^\mathrm{pred} - y_i^\mathrm{DFT})^2 + \lambda\|\theta\|^2$.
    • Benchmarking: JARVIS-Leaderboard provides live, community-driven ranking with uniform metrics (MAE, RMSE, $R^2$, MAD/MAE), versioned data splits, and publicly submitted scripts, ensuring full reproducibility. Over 1,200 contributions span 274 distinct benchmarks and 8.7 million ID–prediction pairs (Choudhary et al., 2023).
  • Example Use Cases:
    • High-throughput DFT screening (e.g., discovery of 55 2D magnetic topological insulators in 48 hours).
    • Thermoelectric ML predictions: band gap MAE <0.1 eV.
    • Mechanical properties: ALIGNN-FF predicts elastic constants within 5 GPa at 1,000× speedup.
    • Inverse design: AtomGPT proposes 1,200 candidate superconductors, with 30 verified ($T_c > 20$ K).
    • Benchmark: QMC reduces DFT+U energy errors from 0.3 eV to 0.05 eV for CrX₃.
    • MLFF achieves vacancy formation energies within 0.1 eV (vs 0.4 eV typical for EAM/MEAM) (Choudhary, 6 Mar 2025).
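
The regression loss and the benchmark metrics named above can be written out in a few lines. This is a sketch in plain Python, not the leaderboard's own scoring code. It assumes that MAD denotes the mean absolute deviation of the reference values about their mean, so that MAD/MAE compares a model against a predict-the-mean baseline. The function names and the toy numbers are hypothetical.

```python
import math


def regression_loss(y_pred, y_ref, theta, lam):
    """Mean squared error plus an L2 penalty: (1/N) sum (pred - ref)^2 + lam * ||theta||^2."""
    mse = sum((p - r) ** 2 for p, r in zip(y_pred, y_ref)) / len(y_ref)
    return mse + lam * sum(t * t for t in theta)


def mae(y_pred, y_ref):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(y_pred, y_ref)) / len(y_ref)


def rmse(y_pred, y_ref):
    """Root-mean-square error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(y_pred, y_ref)) / len(y_ref))


def r2(y_pred, y_ref):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((r - p) ** 2 for p, r in zip(y_pred, y_ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


def mad(y_ref):
    """Mean absolute deviation of the reference data about its mean (assumed definition)."""
    mean_ref = sum(y_ref) / len(y_ref)
    return sum(abs(r - mean_ref) for r in y_ref) / len(y_ref)


# Toy reference values and predictions, chosen only to make the arithmetic easy to follow.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
theta = [1.0, -2.0]  # stand-in for model parameters

loss = regression_loss(y_pred, y_ref, theta, lam=0.01)
scores = {
    "MAE": mae(y_pred, y_ref),
    "RMSE": rmse(y_pred, y_ref),
    "R2": r2(y_pred, y_ref),
    "MAD/MAE": mad(y_ref) / mae(y_pred, y_ref),
}
print(loss, scores)
```

Under the assumed definition, a MAD/MAE ratio well above 1 indicates that the model does substantially better than always predicting the dataset mean.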

5. Advanced Domains: Graph Networks, Force Fields, Vision, Language, and Quantum

JARVIS integrates multiple state-of-the-art algorithmic paradigms:

  • Graph Neural Networks and ML Force Fields:
    • ALIGNN-FF covers 89 elements, using energy-consistent, atom-wise graph convolutions for robust, equivariant force field predictions.
  • Universal Tight-Binding: Active-learning-derived Hamiltonians for rapid, transferable energetics.
  • Computer Vision:
    • AtomVision: Automated simulation and analysis of STM and HAADF–STEM images (U-Net segmentation, blob/pattern detection, ResNet classifiers).
  • Natural Language Processing:
    • ChemNLP: Text classification, named-entity recognition (XLNet F1 = 87%), summarization (T5, ROUGE = 46.5%), and linkages to materials datasets.
  • Quantum Algorithms:
    • VQE via Qiskit/BenchQC on model Hamiltonians.

6. FAIR Principles, Reproducibility, and Community Impact

Openness and reproducibility are enforced at every level:

  • Permanent DOIs for datasets (Figshare) and containerized software workflows.
  • Release of code, data, models, benchmarks, and notebooks with each peer-reviewed publication.
  • Jupyter and Colab notebooks guarantee figure/table reproduction.
  • JARVIS-Leaderboard mandates public code, reproducibility scripts, metadata documentation (software, hardware, wall time), and community submission of benchmarks and results.
  • Community metrics: >150,000 unique users since 2017, >4,000 citations, 2 million downloads, and extensive adoption in industry and academia.
  • Outreach: 4 AIMS, 3 QMMS, 10 JARVIS-Schools, with archives and instructional materials freely available (Choudhary, 6 Mar 2025, Choudhary et al., 2023).

7. Scope, Scalability, and Practical Utility

JARVIS covers a diverse set of materials and properties, supporting exploratory and applied workflows:

  • Materials classes: 3D, 2D, low-dimensional, molecules, topological insulators, superconductors, 2D magnets, metal–organic frameworks, defect/interface systems.
  • Properties: Formation energies, band gaps, elastic moduli, piezo-/dielectric tensors, phonons, electronic/thermal transport, magnetic moments, surface and defect energetics.
  • Software and data accessibility: Open via https://jarvis.nist.gov, REST API, PyPI/GitHub for jarvis-tools and associated packages.
  • Computational efficiency: ML models (e.g., ALIGNN) provide a $10^4$–$10^6\times$ speedup over direct DFT, enabling large-scale screening and design.
  • Benchmark integration: Provides systematic, reproducible comparative analysis across AI, electronic structure, force-fields, quantum computation, and experiment (Choudhary, 6 Mar 2025, Wines et al., 2023, Choudhary et al., 2023, Gurunathan et al., 2022, Choudhary et al., 2020).

JARVIS functions as an end-to-end ecosystem for data-driven materials design, bridging simulation, experiment, and machine intelligence. Its multiscale, multimodal databases and state-of-the-art software infrastructure serve as a foundation for reproducible scientific discovery, rapid prototype-to-pipeline development, and transparent benchmarking in computational materials science.
