
The Deep Arbitrary Polynomial Chaos Neural Network or how Deep Artificial Neural Networks could benefit from Data-Driven Homogeneous Chaos Theory (2306.14753v1)

Published 26 Jun 2023 in cs.NE and stat.ML

Abstract: Artificial Intelligence and Machine Learning have been widely used in various fields of mathematical computing, physical modeling, computational science, communication science, and stochastic analysis. Approaches based on Deep Artificial Neural Networks (DANN) are very popular nowadays. Depending on the learning task, the exact form of DANNs is determined via their multi-layer architecture, activation functions and the so-called loss function. However, for a majority of deep learning approaches based on DANNs, the kernel structure of neural signal processing remains the same, where the node response is encoded as a linear superposition of neural activity, while the non-linearity is triggered by the activation functions. In the current paper, we suggest analyzing the neural signal processing in DANNs from the point of view of homogeneous chaos theory as known from polynomial chaos expansion (PCE). From the PCE perspective, the (linear) response on each node of a DANN could be seen as a $1^{st}$-degree multi-variate polynomial of single neurons from the previous layer, i.e., a linear weighted sum of monomials. From this point of view, the conventional DANN structure relies implicitly (but erroneously) on a Gaussian distribution of neural signals. Additionally, this view reveals that, by design, DANNs do not necessarily fulfill any orthogonality or orthonormality condition for a majority of data-driven applications. Therefore, the prevailing handling of neural signals in DANNs could lead to a redundant representation, as any neural signal may contain some partial information from other neural signals. To tackle that challenge, we suggest employing the data-driven generalization of PCE theory known as arbitrary polynomial chaos (aPC) to construct corresponding multi-variate orthonormal representations on each node of a DANN, obtaining Deep Arbitrary Polynomial Chaos Neural Networks.
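
To make the aPC idea above concrete, the sketch below shows one way an orthonormal univariate polynomial basis can be constructed directly from data, which is the data-driven building block the abstract proposes to apply at each node. This is a minimal illustration assuming NumPy, not code from the paper or its companion MATLAB toolboxes; the function apc_orthonormal_basis and all variable names are hypothetical.

```python
import numpy as np


def apc_orthonormal_basis(samples, degree):
    """Coefficients of a data-driven orthonormal polynomial basis (aPC-style sketch).

    Raw sample moments fill a Hankel matrix M with M[i, j] = E[x^(i+j)].
    If M = L L^T (Cholesky), then C = L^{-1} yields polynomials
    p_i(x) = sum_j C[i, j] x^j with E[p_i(x) p_j(x)] = delta_ij
    under the empirical distribution of the samples.
    """
    samples = np.asarray(samples, dtype=float)
    # Raw sample moments mu_k = mean(x^k) for k = 0 .. 2*degree.
    moments = np.array([np.mean(samples**k) for k in range(2 * degree + 1)])
    # Hankel moment matrix built from those moments.
    M = np.array([[moments[i + j] for j in range(degree + 1)]
                  for i in range(degree + 1)])
    L = np.linalg.cholesky(M)
    return np.linalg.inv(L)  # row i holds the monomial coefficients of p_i


# Example: a basis orthonormal w.r.t. a non-Gaussian (Gamma-distributed) signal.
rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, size=10_000)
C = apc_orthonormal_basis(x, degree=3)

# Evaluate the basis on the data and check empirical orthonormality:
# the Gram matrix should be close to the identity.
V = np.vander(x, N=4, increasing=True)   # columns 1, x, x^2, x^3
P = V @ C.T                              # P[n, i] = p_i(x_n)
print(np.round(P.T @ P / len(x), 2))
```

In a DaPC-style node, such a basis (or its multi-variate generalization) would replace the plain weighted sum, so that each node combines orthonormal polynomials of the previous layer's signals instead of raw, potentially redundant ones; the architecture actually used in the paper is described in the full text and its DaPC NN toolbox.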
