Towards fully covariant machine learning (2301.13724v2)
Abstract: Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the group of transformations that takes one possible representation to another. These are the passive symmetries; they include coordinate freedom, gauge symmetry, and units covariance, all of which have led to important results in physics. In machine learning, the most visible passive symmetry is the relabeling or permutation symmetry of graphs. Our goal is to understand the implications for machine learning of the many passive symmetries in play. We discuss dos and don'ts for machine learning practice if passive symmetries are to be respected. We discuss links to causal modeling, and argue that the implementation of passive symmetries is particularly valuable when the goal of the learning problem is to generalize out of sample. This paper is conceptual: It translates among the languages of physics, mathematics, and machine learning. We believe that consideration and implementation of passive symmetries might help machine learning in the same ways that they transformed physics in the twentieth century.
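The graph relabeling symmetry mentioned in the abstract can be made concrete with a minimal sketch (illustrative only, not code from the paper): permuting the node labels of a graph changes its adjacency-matrix representation but not the underlying graph, so any well-posed graph function must return the same value on every relabeled representation.

```python
import numpy as np

def degree_histogram(adj):
    """A permutation-invariant graph summary: the sorted degree sequence."""
    return np.sort(adj.sum(axis=1))

rng = np.random.default_rng(0)
n = 5

# One arbitrary representation of an undirected graph on n nodes.
adj = rng.integers(0, 2, size=(n, n))
adj = np.triu(adj, 1)
adj = adj + adj.T

# A passive symmetry: an arbitrary relabeling of the nodes,
# expressed as conjugation by a permutation matrix P.
perm = rng.permutation(n)
P = np.eye(n)[perm]
adj_relabeled = P @ adj @ P.T

# The two adjacency matrices differ, but they represent the same graph,
# so the invariant summary agrees on both.
assert np.allclose(degree_histogram(adj), degree_histogram(adj_relabeled))
```

The sorted degree sequence is one of the simplest permutation-invariant readouts; equivariant graph neural networks enforce the same requirement layer by layer rather than only at the output.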