Probing the effects of broken symmetries in machine learning (2406.17747v1)

Published 25 Jun 2024 in physics.chem-ph, cs.LG, and stat.ML

Abstract: Symmetry is one of the most central concepts in physics, and it is no surprise that it has also been widely adopted as an inductive bias for machine-learning models applied to the physical sciences. This is especially true for models targeting the properties of matter at the atomic scale. Both established and state-of-the-art approaches, with almost no exceptions, are built to be exactly equivariant to translations, permutations, and rotations of the atoms. Incorporating symmetries -- rotations in particular -- constrains the model design space and implies more complicated architectures that are often also computationally demanding. There are indications that non-symmetric models can easily learn symmetries from data, and that doing so can even be beneficial for the accuracy of the model. We put a model that obeys rotational invariance only approximately to the test, in realistic scenarios involving simulations of gas-phase, liquid, and solid water. We focus specifically on physical observables that are likely to be affected -- directly or indirectly -- by symmetry breaking, finding negligible consequences when the model is used in an interpolative, bulk, regime. Even for extrapolative gas-phase predictions, the model remains very stable, even though symmetry artifacts are noticeable. We also discuss strategies that can be used to systematically reduce the magnitude of symmetry breaking when it occurs, and assess their impact on the convergence of observables.


Summary

The paper shows that the PET architecture, which relaxes exact rotational invariance, still conserves energy in gas-phase water simulations while allowing a simpler model design.
The paper finds that bulk water properties, including diffusion and correlation functions, are largely unaffected by the approximate rotational invariance, and that the small residual anisotropy shrinks further with rotational averaging.
The paper demonstrates that force and energy predictions for solid-phase ice remain accurate despite the approximate rotational invariance.

Probing the Effects of Broken Symmetries in Machine Learning

The paper "Probing the effects of broken symmetries in machine learning" asks what happens when a machine learning model for atomic-scale simulation obeys rotational invariance only approximately rather than exactly. It examines how relaxing this requirement affects the physical observables predicted by such models.

Overview and Methodology

Machine learning interatomic potentials (MLPs), which model the potential-energy surface (PES) of a system, typically build in the symmetries of the underlying physics. Common practice makes the predicted energy invariant to permutations of identical atoms, translations, rotations, and reflections. The researchers argue that enforcing these symmetries, rotational invariance in particular, complicates the model architecture and adds computational overhead.
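
For reference, the symmetry requirements listed above can be written out explicitly. This is the standard textbook formulation rather than notation taken from the paper; the symbols E, F_i, r_i, Q, t, and pi are chosen here for illustration.

```latex
% Invariance of the energy under rotations/reflections Q, translations t,
% and permutations \pi of identical atoms:
E(\{Q\mathbf{r}_i + \mathbf{t}\}) = E(\{\mathbf{r}_i\}), \qquad
E(\{\mathbf{r}_{\pi(i)}\}) = E(\{\mathbf{r}_i\}), \qquad Q \in O(3).

% The forces \mathbf{F}_i = -\partial E / \partial \mathbf{r}_i are then equivariant:
\mathbf{F}_i(\{Q\mathbf{r}_j + \mathbf{t}\}) = Q\,\mathbf{F}_i(\{\mathbf{r}_j\}).

% For an isolated system, translational and rotational invariance imply
% zero net force and zero net torque:
\sum_i \mathbf{F}_i = \mathbf{0}, \qquad
\sum_i \mathbf{r}_i \times \mathbf{F}_i = \mathbf{0}.
```

The last identity is the one that fails when rotational invariance holds only approximately: a non-zero net torque on an isolated molecule means its angular momentum is no longer conserved.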

To probe the adequacy of approximate symmetry, the authors employ the Point Edge Transformer (PET) architecture. PET maintains translation and permutation invariance but does not enforce exact rotational invariance. The paper trains a PET-based MLP and tests it in simulations of gas-phase, liquid, and solid water. Notably, the training set consists of configurations computed at the revPBE0 level of theory, augmented with D3 dispersion corrections.

Key Findings

Rotational Symmetry Breaking in Gas-Phase Water:
- The isolated water molecule simulations exhibit clear evidence of angular momentum precession due to the non-zero torque, indicating broken rotational symmetry.
- Despite these symmetry violations, energy conservation during molecular dynamics (MD) simulations is maintained, even in extrapolative gas-phase conditions.
- Rotational averaging techniques dramatically reduce symmetry-breaking artifacts, as shown by the smaller net torque and the better-conserved angular momentum.
Behavior in Bulk Water:
- Simulations reveal that both structural properties (pair correlation functions) and dynamical properties (diffusion coefficients) of bulk water are minimally affected by the approximate rotational invariance of the PET model.
- The free energy profiles under constant temperature conditions show barely perceptible anisotropy, which further diminishes with rotational averaging.
Solid-Phase Ice Properties:
- For proton-disordered ice structures, the force and relative energy predictions remain accurate. The deviations between the raw and rotationally averaged PET models are minimal, implying that broken rotational symmetry has a negligible effect in this setting.
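
The net-torque diagnostic mentioned for the gas-phase molecule can be illustrated with a toy model. The sketch below is not the PET model and uses no data from the paper: `toy_energy`, the `eps` parameter, and the water-like coordinates are hypothetical choices made only to show that an energy with a direction-dependent term produces a non-zero net torque, while a purely distance-based energy does not.

```python
import math

def toy_energy(pos, eps):
    """Toy energy: rotationally invariant harmonic pairs plus an eps-weighted
    term that depends on the x-component of each pair vector. The second term
    is a hypothetical stand-in for symmetry breaking."""
    energy = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            energy += 0.5 * (math.dist(pos[i], pos[j]) - 1.0) ** 2
            energy += eps * (pos[i][0] - pos[j][0]) ** 2
    return energy

def forces(pos, eps, h=1e-5):
    """Forces F = -dE/dr from central finite differences."""
    out = []
    for i in range(len(pos)):
        f = []
        for k in range(3):
            plus = [list(p) for p in pos]
            minus = [list(p) for p in pos]
            plus[i][k] += h
            minus[i][k] -= h
            f.append(-(toy_energy(plus, eps) - toy_energy(minus, eps)) / (2 * h))
        out.append(f)
    return out

def net_torque(pos, frc):
    """Sum of r_i x F_i over all atoms."""
    t = [0.0, 0.0, 0.0]
    for r, f in zip(pos, frc):
        t[0] += r[1] * f[2] - r[2] * f[1]
        t[1] += r[2] * f[0] - r[0] * f[2]
        t[2] += r[0] * f[1] - r[1] * f[0]
    return t

# Water-like geometry in arbitrary units (illustrative numbers only).
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

torque_invariant = math.hypot(*net_torque(water, forces(water, eps=0.0)))
torque_broken = math.hypot(*net_torque(water, forces(water, eps=0.05)))
print("net torque, invariant model:", torque_invariant)
print("net torque, symmetry-broken model:", torque_broken)
```

In an MD run, a persistent net torque of this kind is what makes the angular momentum of an isolated molecule drift or precess instead of staying constant.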

Practical and Theoretical Implications

The results matter for both the theory and the practice of atomistic modeling in computational physics and materials science. The main takeaways are:

Relaxation of Symmetry Constraints: By showing that a non-symmetric model can learn rotational invariance to a good approximation from rotationally augmented data, the research suggests that model architectures can be simplified and computational cost reduced without a significant loss of accuracy.
Rotational Augmentation as a Strategy: Applying random rotations to the training structures (data augmentation) is a practical way to teach the model approximate rotational invariance and to limit symmetry-related artifacts in dynamic simulations.
Inference-Time Rotational Averaging: Averaging predictions over rotated copies of the input at inference time further reduces the residual symmetry breaking, giving more physically consistent predictions even for models that are not strictly equivariant by design.
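
Inference-time rotational averaging can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme used in the paper: `raw_energy` is a hypothetical non-invariant toy model, the rotations are random samples drawn once (the paper's choice of rotation set may differ), and all names and numbers are invented for the example.

```python
import math
import random

def random_rotation(rng):
    """Random rotation matrix built from a normalized 4D Gaussian quaternion."""
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def rotate(rot, pos):
    """Apply a 3x3 rotation matrix to every atomic position."""
    return [tuple(sum(rot[a][b] * p[b] for b in range(3)) for a in range(3))
            for p in pos]

def raw_energy(pos, eps=0.05):
    """Hypothetical non-invariant model: harmonic pairs plus an x-dependent term."""
    energy = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            energy += 0.5 * (math.dist(pos[i], pos[j]) - 1.0) ** 2
            energy += eps * (pos[i][0] - pos[j][0]) ** 2
    return energy

def averaged_energy(pos, rotations):
    """Mean of the raw prediction over rotated copies of the input."""
    return sum(raw_energy(rotate(r, pos)) for r in rotations) / len(rotations)

def spread(values):
    """Standard deviation, used here as a measure of symmetry breaking."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

rng = random.Random(0)
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
fixed_rotations = [random_rotation(rng) for _ in range(64)]
orientations = [rotate(random_rotation(rng), water) for _ in range(20)]

# For an exactly invariant model both spreads would be zero.
raw_spread = spread([raw_energy(p) for p in orientations])
avg_spread = spread([averaged_energy(p, fixed_rotations) for p in orientations])
print("spread of raw predictions:", raw_spread)
print("spread of averaged predictions:", avg_spread)
```

Two details carry over to real models. When averaging forces rather than energies, the forces predicted for a rotated input have to be rotated back with the transpose of the rotation matrix before they are averaged. For training-time augmentation, the same rotation is applied to the positions and to the force labels, while the energy label stays unchanged.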

Future Directions

The findings encourage further exploration into flexible model architectures that can balance between computational efficiency and adherence to physical symmetries. Potential future directions include:

Development of training loss functions that explicitly penalize symmetry-breaking to enhance model robustness without stringent architectural constraints.
Expansion of the PET architecture's application to a wider array of materials and molecular systems, validating the generality of the observed phenomena.
Exploration of other ways of relaxing invariance constraints, in combination with various neural network architectures, to improve predictions in computational chemistry and materials science.
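
As an illustration of the first direction, a loss term that penalizes symmetry breaking could take the form of the variance of the model's prediction over rotated copies of the same structure. The sketch below is one hypothetical form of such a penalty, not something proposed or tested in the paper; `symmetry_penalty`, `total_loss`, the `weight` parameter, and both toy models are assumptions made for the example.

```python
import math

def cyclic_rotations(pos):
    """The structure and its two cyclic axis permutations, which are proper
    120-degree rotations about the (1, 1, 1) axis."""
    once = [(y, z, x) for x, y, z in pos]
    twice = [(z, x, y) for x, y, z in pos]
    return [list(pos), once, twice]

def symmetry_penalty(model, pos):
    """Variance of the predicted energy over rotated copies of one structure.
    It vanishes for a rotationally invariant model."""
    energies = [model(p) for p in cyclic_rotations(pos)]
    mean = sum(energies) / len(energies)
    return sum((e - mean) ** 2 for e in energies) / len(energies)

def total_loss(model, pos, target_energy, weight=1.0):
    """Squared energy error plus a weighted symmetry-breaking penalty."""
    error = (model(pos) - target_energy) ** 2
    return error + weight * symmetry_penalty(model, pos)

def invariant_model(pos):
    """Toy model that depends on interatomic distances only."""
    return sum(0.5 * (math.dist(pos[i], pos[j]) - 1.0) ** 2
               for i in range(len(pos)) for j in range(i + 1, len(pos)))

def broken_model(pos, eps=0.05):
    """Toy model with an extra x-dependent, symmetry-breaking term."""
    extra = sum((pos[i][0] - pos[j][0]) ** 2
                for i in range(len(pos)) for j in range(i + 1, len(pos)))
    return invariant_model(pos) + eps * extra

water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
penalty_invariant = symmetry_penalty(invariant_model, water)
penalty_broken = symmetry_penalty(broken_model, water)
print("penalty, invariant model:", penalty_invariant)
print("penalty, symmetry-broken model:", penalty_broken)
```

A practical version would sample many rotations rather than three fixed ones and would apply the same idea to forces, which must be rotated back to a common frame before their variance is computed.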

In conclusion, this investigation clarifies how machine learning models can balance physical principles against computational pragmatism. Its results on approximate rotational invariance open pathways to more efficient and versatile modeling techniques in the physical sciences.
