
Open Catalyst 2025: Catalysis Dataset & Models

Updated 12 November 2025
  • Open Catalyst 2025 is a comprehensive dataset and ML suite for simulating electrocatalytic phenomena at solid–liquid interfaces.
  • It comprises 7.8M DFT calculations across diverse materials, adsorbates, solvents, and ions to provide reliable energy, force, and solvation insights.
  • The platform enables scalable, GNN-based interatomic potential development and integrates multi-physics data for advanced catalysis research.

Open Catalyst 2025 (OC25) is a comprehensive dataset and suite of ML models designed to advance simulation and understanding of catalytic processes at solid–liquid interfaces. OC25 shifts the paradigm beyond earlier Open Catalyst datasets (OC20, OC22) by introducing explicit solvent and ion environments, bridging a crucial gap for modeling electrocatalytic phenomena central to energy storage and sustainable chemical production. Through its unprecedented chemical, structural, and environmental diversity, OC25 delivers a unified platform for the development, benchmarking, and deployment of scalable graph neural network (GNN) interatomic potentials capable of predicting energies, forces, and solvation effects in solution-phase catalysis.

1. Dataset Scope and Composition

OC25 comprises 7,801,261 single-point density functional theory (DFT) calculations across 1,511,270 unique explicit solvent microenvironments, making it the largest dataset for solid–liquid catalytic interfaces currently available (Sahoo et al., 22 Sep 2025). Its elemental and configurational diversity includes:

  • Surfaces: 39,821 unique bulk materials drawn from the Materials Project; all symmetrically distinct low-index facets (Miller index ≤ 3) enumerated and randomly tiled (8 × 8 Å in xy).
  • Adsorbates: 98 molecules, including all OC20 species and 13 additional reactive intermediates; systems contain 1–5 adsorbates/slab, with bias toward single-adsorbate configurations.
  • Solvents and Ions: Eight common solvents (e.g., water, methanol, acetonitrile, ethylene carbonate) and nine inorganic ions (e.g., Li⁺, K⁺, SO₄²⁻). Solvent and ion composition is randomly sampled, with water most prevalent; ions present in approximately 50% of structures.
  • Solvent Layering: Systems range from 5–10 molecular layers (weighted toward ≤6) to control computational expense.
  • Elemental Coverage: Surfaces, adsorbates, solvents, and ions collectively span 88 elements.
  • System Size and Diversity: Average system contains ~144 atoms (range 80–300).
  • Off-equilibrium Sampling: Populated by short NVT ab initio molecular dynamics (AIMD) trajectories at 1000 K (10–50 steps) and subsequent limited ionic relaxations (≤5 steps) to generate a broad force-norm distribution, including high-force configurations.
  • Data Reliability: DFT calculations filtered for force-drift consistency (total drift <1 eV/Å) to ensure gradient accuracy between energy and forces.

These features enable high-fidelity study of solvent, electrolyte, and adsorbate effects in atomistic models that closely mirror experimental electrocatalytic conditions.
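
The force-drift filter above amounts to a one-line check per DFT frame. A minimal sketch (array layout and names are illustrative, not the OC25 pipeline code):

import numpy as np

def passes_drift_filter(forces: np.ndarray, threshold: float = 1.0) -> bool:
    """Keep a DFT frame only if the net residual force is small.

    forces: (n_atoms, 3) array of per-atom forces in eV/Å. The total
    drift is the norm of the vector sum of all forces, which should
    vanish for a well-converged calculation.
    """
    drift = np.linalg.norm(forces.sum(axis=0))
    return drift < threshold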

2. DFT Protocols and Label Definitions

OC25 data is generated using rigorous DFT protocols optimized for scalability and reliability:

  • Code/Functionals: VASP 6.3.2, revised Perdew–Burke–Ernzerhof (RPBE) exchange–correlation, plus Grimme’s D3 zero-damping dispersion correction.
  • Pseudopotential/Basis: 400 eV plane-wave cutoff with projector-augmented wave (PAW) pseudopotentials.
  • k-Point Sampling: Reciprocal density of 40 (as in OC20).
  • Spin Treatment: All calculations are non-spin-polarized; magnetic systems excluded to avoid ambiguities in magnetic ordering.
  • SCF/Relaxation: Electronic convergence EDIFF = 10⁻⁴ eV for training data, 10⁻⁶ eV for validation/test; up to 5 steps for ionic relaxation.
  • AIMD: Short NVT runs (1000 K, 10–50 steps) sample non-equilibrium geometries.
  • Force-drift Filtering: Only samples with total force drift (vector sum) <1 eV/Å are retained to enforce force–energy consistency.
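
For concreteness, these settings map roughly onto an ASE Vasp calculator configuration as below. This is a hedged sketch: the tags follow standard VASP conventions (e.g., IVDW=11 for D3 zero damping), but the released OC25 inputs may differ in detail.

from ase.calculators.vasp import Vasp

# Approximate OC25 training-data settings (illustrative, not the released
# input files). k-points are generated separately from a reciprocal
# density of 40, as in OC20.
calc = Vasp(
    xc="rpbe",   # revised PBE exchange-correlation functional
    ivdw=11,     # Grimme D3 dispersion correction, zero damping
    encut=400,   # plane-wave cutoff (eV) with PAW pseudopotentials
    ediff=1e-4,  # SCF convergence for training data (1e-6 for val/test)
    ispin=1,     # non-spin-polarized
)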

The dataset defines a “pseudo-solvation energy” for each adsorbed configuration:

\Delta E_{\mathrm{solv}} \equiv \Delta E_{\mathrm{ads}}^{\mathrm{solv}} - \Delta E_{\mathrm{ads}}^{\mathrm{vac}}

where

\Delta E_{\mathrm{ads}}^{\mathrm{solv}} = E(\text{surface}+\text{adsorbate}+\text{solvent}) - [E(\text{surface}) + E(\text{solvent}+\text{ion box})]

\Delta E_{\mathrm{ads}}^{\mathrm{vac}} = E(\text{surface}+\text{adsorbate}) - [E(\text{surface}) + E(\text{adsorbate})]

Reference energies are taken from static DFT single-point calculations with no further geometry optimization.
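
Given single-point energies for the reference systems, the label reduces to simple arithmetic. A minimal sketch (function and variable names are illustrative):

def pseudo_solvation_energy(
    e_surf_ads_solv: float,  # E(surface + adsorbate + solvent)
    e_surf: float,           # E(surface)
    e_solv_box: float,       # E(solvent + ion box)
    e_surf_ads: float,       # E(surface + adsorbate)
    e_ads: float,            # E(adsorbate)
) -> float:
    """Pseudo-solvation energy (eV): solvated minus vacuum adsorption energy."""
    de_ads_solv = e_surf_ads_solv - (e_surf + e_solv_box)
    de_ads_vac = e_surf_ads - (e_surf + e_ads)
    return de_ads_solv - de_ads_vac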

3. Baseline Machine Learning Models and Training Protocols

OC25 baseline models are GNNs engineered for atomistic property prediction on large, compositionally and configurationally complex systems:

  • Model Families:
    • eSEN (expressive smooth equivariant networks): Direct and energy-conserving variants, in small (S, 6.3 M) and medium (M, 50.7 M parameter) sizes. Direct variants regress forces with a separate output head and are not guaranteed to be conservative; conserving variants obtain forces as F = -\nabla E by autograd differentiation of the energy.
    • UMA (“Universal Models for Atoms”): S-size (146.6 M params), pre-trained on OC20 then fine-tuned on OC25.
  • Feature Engineering:
    • Cutoff radius: 6 Å; max neighbors (direct: 30, conserving: 300).
    • Spherical harmonic channels: 128; distance basis: 64–128 Gaussians.
    • Angular degrees \ell_{\max} = 2–4; message-passing layers: 4–10.
  • Loss Function: Multi-task MSE,

L = w_E \|E_{\text{pred}} - E_{\text{DFT}}\|^2 + w_F \|F_{\text{pred}} - F_{\text{DFT}}\|^2 + w_S \|\Delta E_{\text{solv, pred}} - \Delta E_{\text{solv, DFT}}\|^2

where typical weights are w_E : w_F : w_S = 10:10:1 (a sketch of this objective follows the list below).

  • Training:
    • AdamW (decoupled weight decay); initial LR 8 \times 10^{-4} (BF16 pretraining), 4 \times 10^{-4} (FP32 fine-tuning).
    • 40 epochs; batch sizes of 44,800–76,800 atoms per step on NVIDIA H100 GPUs.
    • Splits: 7.4M train, 0.2M OOD validation, 0.2M test; specialized OOD splits for solvent/ion/solvation tasks.
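
A minimal PyTorch sketch of this objective and of the energy-conserving force evaluation (the 10:10:1 weights come from the text; the model and batch interfaces are illustrative assumptions):

import torch
import torch.nn.functional as F

def multitask_loss(energy_pred, forces_pred, solv_pred,
                   energy_dft, forces_dft, solv_dft,
                   w_e=10.0, w_f=10.0, w_s=1.0):
    """Multi-task MSE over energy, forces, and pseudo-solvation energy,
    with the 10:10:1 weighting used for the OC25 baselines."""
    return (w_e * F.mse_loss(energy_pred, energy_dft)
            + w_f * F.mse_loss(forces_pred, forces_dft)
            + w_s * F.mse_loss(solv_pred, solv_dft))

def conserving_forces(model, positions, batch):
    """Conserving variant: forces as F = -grad(E) via autograd
    (illustrative model interface; create_graph=True keeps the
    force loss differentiable for training)."""
    positions = positions.requires_grad_(True)
    energy = model(positions, batch)  # scalar total energy per system
    return -torch.autograd.grad(energy.sum(), positions, create_graph=True)[0]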

Performance metrics (mean absolute errors, MAE) are summarized in Table 1.

Table 1. Test mean absolute errors for OC25 baseline models.

Model               | Params | Energy [eV] | Forces [eV/Å] | ΔE_solv [eV]
eSEN-S (direct)     | 6.3M   | 0.138       | 0.020         | 0.060
eSEN-S (conserving) | 6.3M   | 0.105       | 0.015         | 0.045
eSEN-M (direct)     | 50.7M  | 0.060       | 0.009         | 0.040
UMA-S (fine-tuned)  | 146.6M | 0.091       | 0.014         | 0.136

eSEN-M achieves the lowest test MAEs (0.060 eV, 0.009 eV/Å, 0.040 eV for energy, forces, and solvation energy, respectively). Out-of-distribution (OOD) energy MAE (eSEN-S conserving) rises to 0.186 eV for the “both” split (unknown bulks + unknown solvents).

Models show substantial reductions in force and solvation-energy errors relative to OC20-trained baselines: force errors decrease by more than 50%, and solvation-energy errors by more than 2× relative to UMA-OC20. The models are also robust to label noise: training on the looser EDIFF = 10⁻⁴ eV labels still generalizes to the tighter EDIFF = 10⁻⁶ eV validation labels.

4. Complementary Datasets and Multi-Physics Integration

OC25 enables direct synergy with auxiliary datasets targeting physics domains not originally covered by OC20/OC22:

  • AQCat25 (Allam et al., 27 Oct 2025): Introduces 13.5M single-point spin-polarized and higher-fidelity DFT calculations for 47,000 adsorbate–slab systems, substantially expanding coverage of spin and fidelity effects. Spin is critical for 12 transition elements (e.g., Fe, Co, Ni, Cr); OC25 itself is non-spin-polarized for computational consistency, but can be extended via transfer learning.
  • Integration Protocols: Standard fine-tuning causes catastrophic forgetting of the original dataset (e.g., OC20 validation energy MAE degrades from 301 meV to 550 meV). Joint training with “replay” (mixing old and new physics/fidelity samples during optimization) plus explicit metadata conditioning (e.g., Feature-wise Linear Modulation, FiLM) prevents knowledge loss and improves both AQCat25 and OC20 performance. FiLM modulates features by spin/fidelity meta-flags via learned, task-adaptive scaling (a minimal sketch follows this list).
  • Training Recommendations: Always replay 2–20M OC20 samples when retraining mixed-physics models; use normalized energy/force statistics for domain transfer; loss weight ratio 4:100 (energy:force).
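
A minimal sketch of the FiLM conditioning mentioned above; this is a generic Feature-wise Linear Modulation module, not the AQCat25 implementation:

import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Scale and shift node features based on meta-flags such as
    spin polarization or DFT fidelity (generic FiLM, illustrative)."""

    def __init__(self, num_flags: int, feature_dim: int):
        super().__init__()
        # Learned per-feature scale (gamma) and shift (beta) from metadata.
        self.to_gamma = nn.Linear(num_flags, feature_dim)
        self.to_beta = nn.Linear(num_flags, feature_dim)

    def forward(self, h: torch.Tensor, flags: torch.Tensor) -> torch.Tensor:
        # h: (n_atoms, feature_dim) node features; flags: (n_atoms, num_flags)
        return self.to_gamma(flags) * h + self.to_beta(flags)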

This multi-physics, multi-fidelity protocol is foundational for OC25’s role as an extensible platform for advances in catalysis modeling.

5. Lightweight and Scalable Model Architectures

While many high-performing MLIPs push model capacity to hundreds of millions of parameters (e.g., UMA-S: 146.6M), OC25 benchmarking highlights the competitiveness of lightweight geometric message-passing approaches (Geitner, 5 Apr 2024):

  • Lightweight GNNs: Architectures such as MPGNN-Tiny (0.185M params) and GemNet-Mini (3.3M params) can approach DimeNet++ and SchNet accuracy at less than 1/10 the parameter count.
  • Design Patterns: Use of geometric and symmetric message passing, E(3)-equivariance, and direct force regression yields MAEs of 0.0827 eV/Å (MPGNN-Tiny) and 0.0748 eV/Å (GemNet-Mini) on OC20/OC25 force benchmarks, while enabling training in <20 hours on a single GPU; a minimal sketch of this pattern follows the list.
  • Accessibility: These architectures democratize OC25 participation, facilitating prototyping and model iteration by smaller research teams.
  • Suggested Directions: Model compression, parameter-efficient transfer (pruning, quantization, distillation), incorporation of higher-order geometric features, and multi-task heads for simultaneous ΔE_solv and force prediction.
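
The geometric message-passing pattern referenced above can be illustrated compactly. The layer below builds invariant messages from interatomic distances and a direct, E(3)-equivariant force output from unit bond vectors; it is a didactic sketch, not MPGNN-Tiny or GemNet-Mini:

import torch
import torch.nn as nn

class GeometricMPLayer(nn.Module):
    """One lightweight geometric message-passing layer with a direct
    force head (illustrative of the design pattern)."""

    def __init__(self, dim: int = 64):
        super().__init__()
        # Messages are conditioned on the distance, an E(3) invariant.
        self.msg = nn.Sequential(nn.Linear(2 * dim + 1, dim), nn.SiLU(),
                                 nn.Linear(dim, dim))
        # A scalar weight per edge turns unit bond vectors into
        # rotation-equivariant per-edge forces.
        self.force_weight = nn.Linear(dim, 1)

    def forward(self, h, pos, edge_index):
        src, dst = edge_index                      # (2, n_edges) neighbor pairs
        vec = pos[src] - pos[dst]                  # bond vectors
        dist = vec.norm(dim=-1, keepdim=True)
        m = self.msg(torch.cat([h[src], h[dst], dist], dim=-1))
        h = h + torch.zeros_like(h).index_add_(0, dst, m)   # aggregate messages
        f_edge = self.force_weight(m) * (vec / dist)        # per-edge force
        forces = torch.zeros_like(pos).index_add_(0, dst, f_edge)
        return h, forces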

A plausible implication is that further scaling of equivariant architectures may deliver diminishing returns if sufficient task-specific geometric priors are incorporated.

6. Toolkit and Workflow for OC25 Model Development

The Open MatSci ML Toolkit (Miret et al., 2022) provides a unified Python-based infrastructure integrating PyTorch Lightning (for automated scaling, device abstraction) and Deep Graph Library (DGL; for graph construction and batched/distributed learning):

  • Architecture: Clear separation of model, data, and training logic using LightningDataModule, AbstractEnergyModel, and LightningModule classes.
  • DGL Integration: Flexible per-node/edge feature dictionaries; seamless support for CPUs, GPUs, XPUs; efficient batched/distributed computation.
  • Supported Models: Ready-to-use implementations of state-of-the-art equivariant architectures (E(n)-GNN, MegNet, Gala).
    • Message Passing and Losses: Implements generic MPNN layers, E(n)-equivariant updates, L1/MAE losses for S2EF and IS2RE tasks, and physical force consistency F = -\nabla E where required.
  • Scaling/Performance: Near-linear multi-node scaling demonstrated; e.g., MegNet achieves S2EF validation MAE of 0.378 eV/Å on 2M samples—matching GemNet-XL accuracy trained on much larger datasets.
  • Code Example:

import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

# Multi-node, mixed-precision distributed training
# (monitor key below is an example value)
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="gpu", devices=8, num_nodes=4,
    strategy="ddp", precision="bf16",
    callbacks=[EarlyStopping(monitor="val_loss"), ModelCheckpoint(monitor="val_loss")],
)
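
The separation of model, data, and training logic noted above maps onto standard Lightning classes. A schematic skeleton (generic Lightning conventions, not the toolkit's exact API):

import pytorch_lightning as pl
import torch

class EnergyForceTask(pl.LightningModule):
    """Training logic only; the GNN is injected, and batches arrive
    from a separate LightningDataModule (schematic, not toolkit code)."""

    def __init__(self, model: torch.nn.Module, lr: float = 1e-4):
        super().__init__()
        self.model = model
        self.lr = lr

    def training_step(self, batch, batch_idx):
        energy_pred = self.model(batch)
        loss = torch.nn.functional.l1_loss(energy_pred, batch["energy"])
        self.log("train_mae", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)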

This infrastructure enables reproducible, extensible, and scalable participation in OC25 tasks, forming what might be termed the "OC25 ecosystem".

7. Scientific Opportunities and Outlook

OC25’s explicit solvent/ion modeling, scale, and diverse benchmarks expand the scientific scope of ML atomistic simulation in several dimensions:

  • Functional Interface Chemistry: OC25 enables the quantification and prediction of solvent effects, competitive adsorption, double-layer structure, and electrochemical dynamics—critical for design of CO₂ reduction, hydrogen evolution, and related energy-conversion catalysts.
  • Solvation Energy Prediction: Direct ML regression of \Delta E_{\mathrm{solv}} empowers rapid screening of solvent/ion compositions and formulations.
  • Modeling at Longer Time/Length Scales: OC25-trained MLIPs enable orders-of-magnitude faster MD than DFT, which can be critical for sampling of rare events and dynamics at solid–liquid interfaces.
  • Extension to Multi-Physics and Grand-Canonical Dynamics: By integrating data such as AQCat25, future architectures can target spin, fidelity, and grand-canonical (applied potential) regimes.
  • Open-Source Distribution: All OC25 data and models are openly available (https://huggingface.co/facebook/OC25), enabling community-driven improvement, rapid validation, and transfer to emerging catalysis and energy materials domains.

This suggests that OC25 provides both an immediate benchmark for functional interface MLIP development and a foundation for future advances, as researchers extend and specialize models for new regions of chemical and physical space.
