Effect of "Rattled" Structures in OMat24 on Universal MLIPs and Architecture Dependence
Determine whether the large proportion of rattled structures in the OMat24 dataset (approximately 45% of the data), and the magnitude of the applied rattling, are generally beneficial for the current generation of universal machine-learning interatomic potentials (MLIPs), or whether the adverse behaviors observed when training on the full OMat24 dataset are unique to more unconstrained MLIP architectures.
References
Whilst we are broadly in favour of retaining as much of a model's training data as possible, it remains unclear if the large proportion of "rattled" systems in OMat24 (45% of the data), and the amount by which they are rattled, is generally beneficial or not for the current generation of universal MLIPs, or whether the problems we have observed are unique to more unconstrained architectures.