Does equivariance matter at scale? (2410.23179v1)
Abstract: Given large data sets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of each problem? Or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and on general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation can close this gap given sufficient epochs. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget onto model size and training duration differs between equivariant and non-equivariant models.
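As a rough illustration of the second conclusion, a power-law relation between compute and loss can be estimated with a straight-line fit in log-log space. The sketch below is not the authors' fitting procedure: the pure form loss ≈ a · C^(-b) with no irreducible-loss offset, the helper name `fit_power_law`, and the synthetic numbers are all assumptions made for illustration.

```python
import numpy as np


def fit_power_law(compute, loss):
    """Fit loss ~ a * compute**(-b) by least squares in log-log space.

    Hypothetical helper: assumes a pure power law without an offset term.
    """
    slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic, illustrative values only (not results from the paper).
compute = np.array([1e16, 1e17, 1e18, 1e19])
loss = 3.0 * compute ** -0.1

a, b = fit_power_law(compute, loss)
print(f"prefactor a = {a:.3f}, exponent b = {b:.3f}")
```

Fitting one such curve per architecture and comparing the fitted curves at a fixed compute budget is one simple way to phrase the comparison the abstract describes. A fit with an additive offset would need a nonlinear optimizer instead of `np.polyfit`.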