
Algebraic Complexity and Neurovariety of Linear Convolutional Networks (2401.16613v1)

Published 29 Jan 2024 in math.AG and cs.LG

Abstract: In this paper, we study linear convolutional networks with one-dimensional filters and arbitrary strides. The neuromanifold of such a network is a semialgebraic set, represented by a space of polynomials admitting specific factorizations. Introducing a recursive algorithm, we generate polynomial equations whose common zero locus is the Zariski closure of this neuromanifold. Furthermore, we explore the algebraic complexity of training these networks using tools from metric algebraic geometry. Our findings reveal that the number of complex critical points arising in the optimization of such a network equals the generic Euclidean distance degree of a Segre variety. Notably, this count significantly exceeds the number of critical points encountered in training a fully connected linear network with the same number of parameters.
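
The "specific factorizations" mentioned in the abstract can be made concrete in the simplest setting. For stride-1 layers, stacking linear (activation-free) 1D convolutions yields another convolution whose filter is the convolution of the layer filters, i.e. the product of the associated filter polynomials; the neuromanifold then consists of filters whose polynomial factors into pieces of the prescribed degrees (for larger strides, the factors additionally involve substituted powers of the variable, roughly speaking). The sketch below is not from the paper; it only checks this stride-1 identity numerically, and the helper `conv_matrix`, the filter lengths, and the input length are illustrative choices.

```python
import numpy as np

def conv_matrix(w, n):
    """Linear map of a 1D convolution with stride 1 and no padding:
    output[i] = sum_j w[j] * x[i + j], written as an (n-k+1) x n matrix."""
    k = len(w)
    T = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        T[i, i:i + k] = w
    return T

rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal(3), rng.standard_normal(4)  # filters of the two layers
n = 10                                                   # input length

# End-to-end linear map of the two-layer network ...
end_to_end = conv_matrix(w2, n - len(w1) + 1) @ conv_matrix(w1, n)

# ... is itself a single convolution whose filter is the convolution of the
# layer filters, i.e. the product of the filter polynomials
# (np.convolve multiplies coefficient sequences).
w = np.convolve(w1, w2)
assert np.allclose(end_to_end, conv_matrix(w, n))
print("composite filter coefficients:", w)
```

With more layers the identity iterates, so fitting a target with such a network amounts to approximating a given filter by a product of lower-degree polynomials; counting the complex critical points of that distance-minimization problem is exactly where the (generic) Euclidean distance degree enters.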
