Optimization Dynamics of Equivariant and Augmented Neural Networks (2303.13458v5)

Published 23 Mar 2023 in cs.LG and math.OC

Abstract: We investigate the optimization of neural networks on symmetric data, and compare the strategy of constraining the architecture to be equivariant to that of using data augmentation. Our analysis reveals that the relative geometry of the admissible and the equivariant layers plays a key role. Under natural assumptions on the data, network, loss, and group of symmetries, we show that compatibility of the spaces of admissible layers and equivariant layers, in the sense that the corresponding orthogonal projections commute, implies that the sets of equivariant stationary points are identical for the two strategies. If the linear layers of the network are also given a unitary parametrization, the set of equivariant layers is even invariant under the gradient flow for augmented models. Our analysis, however, also reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
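The compatibility condition in the abstract, that the orthogonal projections onto the spaces of admissible and equivariant layers commute, can be checked numerically for a concrete group. The following is a minimal sketch, not taken from the paper: it uses the cyclic group C_4 acting on R^4 by cyclic shifts, realizes the equivariant projection with the standard group-averaging (Reynolds) operator, and takes symmetric matrices as a hypothetical stand-in for the admissible layer space.

import numpy as np

n = 4
# Cyclic group C_4 acting on R^4 by cyclic shifts (permutation matrices).
shift = np.roll(np.eye(n), 1, axis=0)
group = [np.linalg.matrix_power(shift, k) for k in range(n)]

d = n * n  # a layer W is an n x n matrix, vectorized to length n^2

# Orthogonal projection onto equivariant layers via group averaging
# (Reynolds operator): W -> (1/|G|) sum_g g^{-1} W g. For permutation
# matrices, g^{-1} = g^T, and on (row-major) vectorized layers the map
# W -> g^T W g acts as the Kronecker product of g^T with itself.
P_eq = sum(np.kron(g.T, g.T) for g in group) / len(group)

# Hypothetical "admissible" space: symmetric matrices. Its orthogonal
# projection is W -> (W + W^T)/2, i.e. (I + T)/2 on vectorized layers,
# where T swaps the entries (i, j) <-> (j, i).
T = np.zeros((d, d))
for i in range(n):
    for j in range(n):
        T[i * n + j, j * n + i] = 1.0
P_adm = 0.5 * (np.eye(d) + T)

print(np.allclose(P_eq @ P_eq, P_eq))           # P_eq is a projection: True
print(np.allclose(P_eq @ P_adm, P_adm @ P_eq))  # the projections commute: True

For this illustrative choice the projections do commute (group averaging preserves the symmetry of a matrix), so the compatibility condition holds; under the abstract's other assumptions, the equivariant stationary points of the augmented and the manifestly equivariant models would then coincide.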
