A singular Riemannian Geometry Approach to Deep Neural Networks III. Piecewise Differentiable Layers and Random Walks on $n$-dimensional Classes

Published 9 Apr 2024 in math.DG and cs.LG | (2404.06104v1)

Abstract: Neural networks play a crucial role in everyday life, and the most recent generative models achieve impressive results. Nonetheless, how they work is still not well understood, and several strategies have been adopted to study how and why these models reach their outputs. A common approach is to consider the data in a Euclidean setting; recent years have instead witnessed a shift away from this paradigm toward a more general framework, namely Riemannian geometry. Two recent works introduced a geometric framework to study neural networks making use of singular Riemannian metrics. In this paper we extend these results to convolutional, residual and recursive neural networks, also studying the case of non-differentiable activation functions, such as ReLU. We illustrate our findings with numerical experiments on image classification and thermodynamic problems.
