Optimization Dynamics of Equivariant and Augmented Neural Networks (2303.13458v5)
Abstract: We investigate the optimization of neural networks on symmetric data, and compare the strategy of constraining the architecture to be equivariant to that of using data augmentation. Our analysis reveals that the relative geometry of the admissible and the equivariant layers, respectively, plays a key role. Under natural assumptions on the data, network, loss, and group of symmetries, we show that compatibility of the spaces of admissible layers and equivariant layers, in the sense that the corresponding orthogonal projections commute, implies that the sets of equivariant stationary points are identical for the two strategies. If the linear layers of the network are also given a unitary parametrization, the set of equivariant layers is even invariant under the gradient flow for augmented models. Our analysis, however, also reveals that even in the latter situation, stationary points may be unstable for augmented training, although they are stable for the manifestly equivariant models.
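The commuting-projections condition in the abstract can be checked numerically. The sketch below is not from the paper; the choice of group, the sparsity masks, and the function names are illustrative assumptions. It takes the cyclic shift group acting on R^4: the projection onto equivariant layers is the Reynolds average over the group, and two candidate spaces of admissible layers are given by sparsity masks. A cyclically banded mask is invariant under the group action, so its projection commutes with the equivariant one; a plain (non-cyclic) band is not invariant, and commutation fails.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)

def shift(k):
    """Permutation matrix of the cyclic shift: (shift(k) @ x)_i = x_{(i-k) mod n}."""
    return np.roll(np.eye(n), k, axis=0)

def proj_equivariant(W):
    """Reynolds projection onto shift-equivariant (circulant) layers:
    P_E(W) = (1/|G|) * sum_k rho(k)^{-1} @ W @ rho(k)."""
    return sum(shift(-k) @ W @ shift(k) for k in range(n)) / n

def proj_admissible(W, mask):
    """Orthogonal projection onto layers supported on a sparsity mask
    (a stand-in for the space of admissible layers)."""
    return W * mask

i, j = np.indices((n, n))
# Cyclic band: entries with (i - j) mod n in {0, 1, n-1}. This mask is
# invariant under conjugation by shifts, so the projections commute.
cyclic_band = (((i - j) % n <= 1) | ((j - i) % n <= 1)).astype(float)
# Plain band: |i - j| <= 1. Not shift-invariant, so commutation fails.
plain_band = (np.abs(i - j) <= 1).astype(float)

W = rng.standard_normal((n, n))
for name, mask in [("cyclic band", cyclic_band), ("plain band", plain_band)]:
    PA_PE = proj_admissible(proj_equivariant(W), mask)
    PE_PA = proj_equivariant(proj_admissible(W, mask))
    print(f"{name}: projections commute -> {np.allclose(PA_PE, PE_PA)}")
```

In the paper's terminology, the first admissible space would be compatible with the equivariant layers, so the equivariant stationary points of constrained and augmented training coincide, while the second would not; the script is only an illustration of the geometric condition, not a reproduction of the paper's experiments.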