Engineered Ordinary Differential Equations as Classification Algorithm (EODECA): thorough characterization and testing (2312.14681v2)
Abstract: EODECA (Engineered Ordinary Differential Equations as Classification Algorithm) is a novel approach at the intersection of machine learning and dynamical systems theory that frames classification tasks within a dynamical-system structure [1]. The method uses ordinary differential equations (ODEs) to handle complex classification challenges efficiently. The paper examines EODECA's dynamical properties, emphasizing its resilience against random perturbations and its robust performance across various classification scenarios. Notably, EODECA's design allows stable attractors to be embedded in the phase space, which enhances reliability and permits reversible dynamics. In this paper, we expand on the work of [1] with a comprehensive analysis based on an Euler discretization scheme. In particular, we evaluate EODECA's performance on five distinct classification problems, examining its adaptability and efficiency. We demonstrate EODECA's effectiveness on the MNIST and Fashion-MNIST datasets, achieving accuracies of $98.06\%$ and $88.21\%$, respectively. These results are comparable to those of a multi-layer perceptron (MLP), underscoring EODECA's potential in complex data-processing tasks. We further analyze the model's learning dynamics, comparing its behavior before and after training and highlighting its ability to converge to stable attractors. The study also investigates the invertibility of EODECA, shedding light on its decision-making process and internal workings. This work is a step towards a more transparent and robust machine learning paradigm, bridging the gap between machine learning algorithms and dynamical systems methodologies.
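The abstract's two key ingredients — an ODE whose flow relaxes onto stable fixed points, integrated with an explicit Euler scheme — can be illustrated with a minimal sketch. The Hopfield-style vector field, the weight normalization, and the step size below are hypothetical choices made for illustration; this is not the trained EODECA model of the paper.

```python
import numpy as np

def euler_relax(x0, W, b, dt=0.1, steps=200):
    """Explicit Euler integration of dx/dt = -x + W @ tanh(x) + b.

    Each step applies x <- x + dt * f(x), the scheme mentioned in
    the abstract; with a contracting field the state relaxes to a
    stable fixed point (an attractor) of the dynamics.
    """
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (-x + W @ np.tanh(x) + b)
    return x

rng = np.random.default_rng(0)
n = 4
W = rng.standard_normal((n, n))
W = 0.5 * W / np.linalg.norm(W, 2)  # spectral norm 0.5 -> contracting flow
b = rng.standard_normal(n)

# Integrate from a random initial condition until the state settles.
x_star = euler_relax(rng.standard_normal(n), W, b)

# At a fixed point the vector field vanishes, so the residual is ~0.
residual = np.linalg.norm(-x_star + W @ np.tanh(x_star) + b)
```

Because the spectral norm of `W` is below one, each Euler step is a contraction, so every initial condition converges to the same fixed point. In a classifier built along these lines, the training would instead engineer one attractor per class and read the label off the basin in which the input's trajectory lands.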
- C. M. Bishop, Pattern recognition and machine learning, 5th Edition (Springer, 2007).
- S. Shalev-Shwartz and S. Ben-David, Understanding machine learning: From theory to algorithms (Cambridge University Press, 2014).
- Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
- I. Goodfellow, Y. Bengio, and A. Courville, Deep learning (MIT press, 2016).
- S. J. Prince, Understanding Deep Learning (MIT press, 2023).
- J. B. Heaton, N. G. Polson, and J. H. Witte, Deep learning for finance: deep portfolios, Applied Stochastic Models in Business and Industry 33, 3 (2017).
- O. B. Sezer, M. U. Gudelek, and A. M. Ozbayoglu, Financial time series forecasting with deep learning: A systematic literature review: 2005–2019, Applied Soft Computing 90, 106181 (2020).
- R. Marino and N. Macris, Solving non-linear Kolmogorov equations in large dimensions by using deep learning: a numerical comparison of discretization schemes, Journal of Scientific Computing 94, 8 (2023).
- R. Marino, Learning from survey propagation: a neural network for MAX-E-3-SAT, Machine Learning: Science and Technology 2, 035032 (2021).
- R. Marino and F. Ricci-Tersenghi, Phase transitions in the mini-batch size for sparse and dense neural networks, arXiv preprint arXiv:2305.06435 (2023).
- R. Marino, G. Parisi, and F. Ricci-Tersenghi, The backtracking survey propagation algorithm for solving random K-SAT problems, Nature Communications 7, 12996 (2016a).
- C. Rudin, Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Nature Machine Intelligence 1, 206 (2019).
- C. Molnar, Interpretable machine learning (Lulu.com, 2020).
- E. Weinan, A proposal on machine learning via dynamical systems, Communications in Mathematics and Statistics 1, 1 (2017).
- K. Huang, Introduction to statistical physics (CRC Press, 2009).
- R. Marino and E. Aurell, Advective-diffusive motion on large scales from small-scale dynamics with an internal symmetry, Physical Review E 93, 062147 (2016).
- R. Marino, R. Eichhorn, and E. Aurell, Entropy production of a Brownian ellipsoid in the overdamped limit, Physical Review E 93, 012132 (2016b).
- M. Baldovin, R. Marino, and A. Vulpiani, Ergodic observables in non-ergodic systems: the example of the harmonic chain, Physica A: Statistical Mechanics and its Applications, 129273 (2023).
- A. Vulpiani, F. Cecconi, and M. Cencini, Chaos: from simple models to complex systems, Vol. 17 (World Scientific, 2009).
- E. Ott, Chaos in dynamical systems (Cambridge University Press, 2002).
- B. Eshete, Making machine learning trustworthy, Science 373, 743 (2021).
- D. V. Anosov and V. I. Arnold, Dynamical systems I: ordinary differential equations and smooth dynamical systems (Springer, 1988).
- J. M. T. Thompson, H. B. Stewart, and R. Turner, Nonlinear dynamics and chaos, Computers in Physics 4, 562 (1990).
- D. Sherrington and S. Kirkpatrick, Solvable model of a spin-glass, Physical Review Letters 35, 1792 (1975).
- D. Panchenko, The Sherrington-Kirkpatrick model (Springer Science & Business Media, 2013).
- G. Nicolis, Introduction to nonlinear science (Cambridge University Press, 1995).
- N. Bansal, X. Chen, and Z. Wang, Can we gain more from orthogonality regularizations in training deep networks?, Advances in Neural Information Processing Systems 31 (2018).
- A. Iserles, A first course in the numerical analysis of differential equations, Vol. 44 (Cambridge University Press, 2009).
- R. Marino and S. Kirkpatrick, Hard optimization problems have soft edges, Scientific Reports 13, 3671 (2023a).
- R. Marino and S. Kirkpatrick, Large independent sets on random d-regular graphs with fixed degree d, Computation 11, 206 (2023b).
- M. Mézard, G. Parisi, and M. A. Virasoro, Spin glass theory and beyond: An Introduction to the Replica Method and Its Applications, Vol. 9 (World Scientific Publishing Company, 1987).
- J. J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proceedings of the National Academy of Sciences 79, 2554 (1982).
- L. Deng, The MNIST database of handwritten digit images for machine learning research, IEEE Signal Processing Magazine 29, 141 (2012).
- M. A. Belyaev and A. A. Velichko, Classification of handwritten digits using the Hopfield network, IOP Conference Series: Materials Science and Engineering 862, 052048 (2020).
- C. Lucibello and M. Mézard, The exponential capacity of dense associative memories, arXiv preprint arXiv:2304.14964 (2023).
- J. J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proceedings of the National Academy of Sciences 81, 3088 (1984).
- J. Han, A. Jentzen, and W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences 115, 8505 (2018).
- H. Xiao, K. Rasul, and R. Vollgraf, Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017).
- A. Fachechi, A. Barra, E. Agliari, and F. Alemanno, Outperforming RBM feature-extraction capabilities by “dreaming” mechanism, IEEE Transactions on Neural Networks and Learning Systems (2022).
- S. Hochreiter and J. Schmidhuber, Flat minima, Neural computation 9, 1 (1997).