Construct Diverse, High-Quality Benchmark Test Sets for U-MLIPs
Develop systematic procedures for constructing diverse, high-quality benchmark test sets for universal machine-learned interatomic potentials (U-MLIPs). The test sets should enable evaluation of physically meaningful properties such as surface energies, elastic moduli, and defect energetics, while mitigating the risk of overfitting to implicit targets during model optimization.
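As one illustration of property-level evaluation, the sketch below computes a surface energy from slab and bulk total energies using the standard symmetric-slab expression, gamma = (E_slab - N * E_bulk_per_atom) / (2 * A), and then scores a model against reference values with a mean absolute error. The function names (`surface_energy`, `property_mae`) and all numbers in the usage lines are hypothetical and chosen only for illustration. The sketch assumes a stoichiometric slab with two equivalent surfaces, energies in eV, and areas in Å^2. It is not a prescribed benchmark procedure.

```python
# Exact conversion: 1 eV/Å^2 = 1.602176634e-19 J / 1e-20 m^2
EV_PER_A2_TO_J_PER_M2 = 16.02176634


def surface_energy(e_slab, n_atoms_slab, e_bulk_per_atom, area):
    """Surface energy in eV/Å^2 for a symmetric, stoichiometric slab.

    e_slab           total energy of the slab cell (eV)
    n_atoms_slab     number of atoms in the slab cell
    e_bulk_per_atom  bulk energy per atom (eV)
    area             in-plane area of one surface (Å^2)
    """
    if area <= 0:
        raise ValueError("area must be positive")
    # The factor 2 accounts for the top and bottom surfaces of the slab.
    return (e_slab - n_atoms_slab * e_bulk_per_atom) / (2.0 * area)


def property_mae(predicted, reference):
    """Mean absolute error between predicted and reference property values."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical numbers, for illustration only (not real data).
gamma_model = surface_energy(e_slab=-38.0, n_atoms_slab=10,
                             e_bulk_per_atom=-4.0, area=10.0)
gamma_ref = surface_energy(e_slab=-37.8, n_atoms_slab=10,
                           e_bulk_per_atom=-4.0, area=10.0)
error_j_per_m2 = property_mae([gamma_model], [gamma_ref]) * EV_PER_A2_TO_J_PER_M2
```

Scoring a derived quantity in this way, rather than per-structure energies and forces alone, is the kind of evaluation the objective refers to. To limit overfitting to implicit targets, the structures behind such a score would need to be kept out of model selection and hyperparameter tuning.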
References
These evaluations underscore the importance of assessing model performance not only in terms of energy and force errors, but also with respect to physically meaningful properties such as surface energies, elastic moduli, and defect energetics—quantities that are often more relevant for practical applications. However, constructing diverse and high-quality test sets for such evaluations remains an open challenge.